Data Mining Function – Regression
Regression is a data mining function that predicts a number.
Age, weight, distance, temperature, income, or sales could all be predicted using regression techniques.
For example, a regression model could be used to predict children’s height, given their age, weight, and other factors.
A regression task begins with a data set in which the target values are known.
For example, a regression model that predicts children’s height could be developed based on observed data for many children over a period of time.
The data might track age, height, weight, developmental milestones, family history, and so on.
Height would be the target, the other attributes would be the predictors, and the data for each child would constitute a case.
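As a rough illustration of these terms, the sketch below lays out a tiny data set in which each record is one case, `height_cm` is the target, and the remaining attributes are the predictors. The attribute names and all values are hypothetical and chosen only for this example.

```python
# Each dict is one case (one child). All names and values are hypothetical.
cases = [
    {"age_years": 4, "weight_kg": 16.0, "height_cm": 102.0},
    {"age_years": 6, "weight_kg": 21.0, "height_cm": 115.0},
    {"age_years": 8, "weight_kg": 26.0, "height_cm": 128.0},
]

TARGET = "height_cm"  # the attribute the model should learn to predict


def split_case(case):
    """Separate one case into its predictors and its known target value."""
    predictors = {name: value for name, value in case.items() if name != TARGET}
    return predictors, case[TARGET]


X = []  # predictors, one entry per case
y = []  # known target values, one entry per case
for case in cases:
    predictors, target = split_case(case)
    X.append(predictors)
    y.append(target)

print(X[0], y[0])
```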
Regression models are tested by computing various statistics that measure the difference between the predicted values and the expected values.
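The text does not say which statistics are meant. As one possible illustration, the sketch below computes two commonly used ones, mean absolute error and root mean squared error. The function names and the sample heights are assumptions made for this example.

```python
import math


def mean_absolute_error(expected, predicted):
    """Average absolute difference between expected and predicted values."""
    return sum(abs(e - p) for e, p in zip(expected, predicted)) / len(expected)


def root_mean_squared_error(expected, predicted):
    """Square root of the average squared difference; penalizes large misses more."""
    squared = sum((e - p) ** 2 for e, p in zip(expected, predicted))
    return math.sqrt(squared / len(expected))


# Hypothetical heights in cm: known values and a model's predictions for them.
expected_heights = [100.0, 110.0, 120.0, 130.0]
predicted_heights = [102.0, 108.0, 123.0, 129.0]

mae = mean_absolute_error(expected_heights, predicted_heights)
rmse = root_mean_squared_error(expected_heights, predicted_heights)
print(f"MAE = {mae:.2f} cm, RMSE = {rmse:.2f} cm")
```

Both statistics are in the same unit as the target, so a smaller value means the predictions sit closer to the known heights.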
Developing quality regression models for data mining requires an understanding of the mathematics used in regression analysis.
The goal of regression analysis is to determine the parameter values that cause a function to best fit a set of data observations that you provide.
The equation below expresses this in symbols: regression is the process of estimating the value of a continuous target (y) as a function (F) of one or more predictors (x1, x2, …, xn), a set of parameters (θ1, θ2, …, θn), and a measure of error (e).
y = F(x,θ) + e
The process of training a regression model involves finding the parameter values for the function that minimize a measure of the error.
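One way to picture training is as an iterative search over the parameters. The sketch below assumes that F is a straight line with a single predictor and that the error measure is the mean squared error, then uses plain gradient descent to reduce that error. The learning rate, step count, and sample data are arbitrary choices for illustration. Gradient descent is only one of several ways to find such parameters.

```python
def predict(x, theta1, theta2):
    """Assumed form of F: intercept theta1 plus slope theta2 times x."""
    return theta2 * x + theta1


def mean_squared_error(xs, ys, theta1, theta2):
    """Error measure to minimize: average squared difference from the targets."""
    return sum((predict(x, theta1, theta2) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, learning_rate=0.05, steps=5000):
    """Gradient descent: repeatedly nudge the parameters downhill on the error."""
    theta1, theta2 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [predict(x, theta1, theta2) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_theta1 = 2.0 * sum(residuals) / n
        grad_theta2 = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        theta1 -= learning_rate * grad_theta1
        theta2 -= learning_rate * grad_theta2
    return theta1, theta2


# Hypothetical observations generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

theta1, theta2 = train(xs, ys)
final_error = mean_squared_error(xs, ys, theta1, theta2)
print(f"theta1 = {theta1:.4f}, theta2 = {theta2:.4f}, error = {final_error:.2e}")
```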
- The simplest form of regression to visualize is linear regression with a single predictor.
- A linear regression technique can be used if the relationship between x and y can be approximated with a straight line.
- Linear regression with a single predictor can be expressed by the following equation.
- y = θ2·x + θ1 + e
- The regression parameters in simple linear regression are:
  - The slope of the line (θ2): the change in y for each unit change in x, which determines how steeply the regression line rises or falls.
  - The y-intercept (θ1): the value of y at the point where the line crosses the y-axis (x = 0).
- Often the relationship between x and y cannot be approximated with a straight line.
- In this case, a nonlinear regression technique may be used. Alternatively, the data could be preprocessed to make the relationship linear.
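The points in the list above can be sketched together. The example below fits a straight line with the standard closed-form least-squares formulas for a single predictor, then applies it to data that is not linear by first taking the logarithm of y. The exponential data, the choice of a log transform, and the name `fit_line` are assumptions for illustration. Other relationships would need a different transform or a nonlinear technique.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (theta1, theta2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    theta2 = sxy / sxx                 # slope
    theta1 = mean_y - theta2 * mean_x  # y-intercept
    return theta1, theta2


# Hypothetical data following y = 3 * exp(0.5 * x), which is not a straight line.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0 * math.exp(0.5 * x) for x in xs]

# Preprocess: log(y) = log(3) + 0.5 * x, which is linear in x.
log_ys = [math.log(y) for y in ys]
theta1, theta2 = fit_line(xs, log_ys)

# Map the fitted line back to the original scale.
scale = math.exp(theta1)  # recovers the multiplier
rate = theta2             # recovers the growth rate
print(f"y is approximately {scale:.3f} * exp({rate:.3f} * x)")
```

Fitting the line to the raw `ys` instead would still return a slope and an intercept, but the straight line would describe the curved data poorly.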