Given our usual dataset made of input vectors and output values, the design matrix is the matrix whose rows are the input vectors.
Definition
Let $S_{\text{train}}$ be a dataset made of paired input vectors $\vec{x}_i$ and output values $y_i$:

$$S_{\text{train}} = \{ (\vec{x}_i, y_i) \mid i \leq n \}$$

This is the usual training set for most of our supervised-learning articles, such as the OLS regression.
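To make the later examples concrete, here is a minimal NumPy sketch of such a training set; the values are arbitrary and the variable names are ours, chosen only to serve the snippets that follow.

```python
import numpy as np

# Toy training set: n = 3 pairs (x_i, y_i) with d = 2 input components.
# The values are arbitrary and only serve the snippets below.
S_train = [
    (np.array([1.0, 2.0]), 5.0),
    (np.array([3.0, 0.5]), 2.0),
    (np.array([0.0, 4.0]), 7.0),
]
```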
The design matrix is the matrix $X$ whose $i$-th row is the transposed $i$-th input vector $\vec{x}_i^\top$:

$$X = \begin{pmatrix} \leftarrow \vec{x}_1^\top \rightarrow \\ \vdots \\ \leftarrow \vec{x}_n^\top \rightarrow \end{pmatrix}$$

If we write $x_{ij}$ for the $j$-th component of the vector $\vec{x}_i$:

$$X = \begin{pmatrix} x_{11} & \dots & x_{1d} \\ \vdots & & \vdots \\ x_{n1} & \dots & x_{nd} \end{pmatrix}$$

Using the jargon of statistics: the design matrix is the matrix that contains the data on the explanatory variables, with each row representing one individual observation.
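Continuing the sketch above, building the design matrix amounts to stacking the input vectors as rows (one of several equivalent ways to do it in NumPy):

```python
# Stack the input vectors as rows: the design matrix has shape (n, d).
X = np.stack([x for x, _ in S_train])
print(X)
# [[1.  2. ]
#  [3.  0.5]
#  [0.  4. ]]
```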
Design matrix for linear regressions
Design matrices for linear regressions are introduced in this article: Vector notation for linear regressions.
In linear regression, we often assume that the input vectors have been extended during preprocessing with a constant component $x_{i0} = 1$. This component is called the bias term and simplifies the notation for linear regression. With this convention, the design matrix has an additional column of ones:

$$X = \begin{pmatrix} 1 & x_{11} & \dots & x_{1d} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \dots & x_{nd} \end{pmatrix}$$
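In code, this convention is a one-line preprocessing step; continuing the same sketch:

```python
# Prepend the constant component x_i0 = 1 to every row:
# the design matrix gains a leading column of ones, shape (n, d + 1).
X = np.hstack([np.ones((X.shape[0], 1)), X])
```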
Thanks to the bias term, a linear function $f_{\vec{w}}$ parametrized by a vector of parameters $\vec{w}$ can be expressed in vector notation using a matrix product:

$$\hat{y}_i = f_{\vec{w}}(\vec{x}_i) = w_0 + \sum_{j=1}^{d} x_{ij} w_j = \vec{x}_i^\top \vec{w}$$

Writing this equation for each input vector:
$$\underbrace{\begin{pmatrix} \hat{y}_1 \\ \vdots \\ \hat{y}_n \end{pmatrix}}_{\hat{\vec{y}}} = \begin{pmatrix} \vec{x}_1^\top \vec{w} \\ \vdots \\ \vec{x}_n^\top \vec{w} \end{pmatrix}$$

We can factor out the design matrix:
$$\hat{\vec{y}} = X \vec{w}$$

This convention greatly simplifies notation. For instance, it lets us express the solution to an ordinary least squares regression in a single compact formula.
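To close the sketch, the matrix product computes all predictions at once; the weight vector below is arbitrary, and the fit uses the standard OLS closed form $\vec{w}^\star = (X^\top X)^{-1} X^\top \vec{y}$, computed here with NumPy's least-squares solver for numerical stability:

```python
# An arbitrary parameter vector w = (w0, w1, w2), w0 being the bias weight.
w = np.array([1.0, 0.5, -0.2])

# All predictions at once: y_hat = X w.
y_hat = X @ w

# OLS fit: the w minimizing ||X w - y||^2, i.e. the solution of the
# normal equations (X^T X) w = X^T y, solved here via np.linalg.lstsq.
y = np.array([y_i for _, y_i in S_train])
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
```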