Let's put the theory into practice. One direct use of solving linear systems in machine learning is determining the optimal parameters (coefficients) of a model such as linear regression. The goal is to find the line or hyperplane that best fits the data points.

Consider a simple linear regression setting. We have a set of data points $(x_i, y_i)$, where $x_i$ represents the features and $y_i$ the target value. For a model with multiple features, $x_i$ is a vector. We want to find a set of coefficients, which we call $\theta$ (often written as $x$ in the $Ax=b$ form), such that the model's predictions $\hat{y} = X\theta$ are as close as possible to the actual targets $y$.

The standard approach in linear regression is to minimize the sum of squared errors between the predicted and actual values. This minimization problem leads directly to a system of linear equations known as the normal equations:

$$X^T X \theta = X^T y$$

Here:

- $X$ is the feature matrix (often called the design matrix). Each row represents a data point and each column a feature. A column of ones is usually added to $X$ to account for the intercept term.
- $\theta$ is the vector of model coefficients we want to find.
- $y$ is the vector of actual target values.
- $X^T$ is the transpose of the feature matrix $X$.

Note that this equation has the familiar form $Ax = b$, where $A = X^T X$, $x = \theta$, and $b = X^T y$. Assuming the matrix $A = X^T X$ is invertible, we can solve for $\theta$ using the matrix inverse:

$$\theta = (X^T X)^{-1} X^T y$$

Let's work through this with a NumPy example. Suppose we have some data, say house square footage and the corresponding prices, and we want to find the linear model price = intercept + coefficient * square_footage that best fits it.

### Setting Up the Data

First, we define the sample data. For a slightly richer example, we use two features (square footage and number of bedrooms) to predict price.

```python
import numpy as np

# Sample data: [square footage, number of bedrooms]
X_features = np.array([
    [1500, 3],
    [2000, 4],
    [1200, 2],
    [1800, 3],
    [2500, 4]
])

# Corresponding house prices (in $1000s)
y_target = np.array([300, 450, 250, 400, 550])

# Add a column of ones to X_features for the intercept term
# This creates our design matrix X
X_design = np.hstack([np.ones((X_features.shape[0], 1)), X_features])

print("Design matrix X (with intercept column):\n", X_design)
print("\nTarget vector y:\n", y_target)
```

### Computing the Coefficients with the Matrix Inverse

Now we apply the normal equation formula $\theta = (X^T X)^{-1} X^T y$.

```python
# Compute X transpose times X (XTX)
XTX = X_design.T @ X_design  # @ is the matrix multiplication operator

# Compute X transpose times y (XTy)
XTy = X_design.T @ y_target

print("\nX^T * X:\n", XTX)
print("\nX^T * y:\n", XTy)

# Compute the inverse of XTX
XTX_inv = np.linalg.inv(XTX)
print("\nInverse of (X^T * X):\n", XTX_inv)

# Compute the coefficients theta
theta = XTX_inv @ XTy

print("\nCoefficients (theta) computed via the matrix inverse:\n", theta)
print(f"Intercept: {theta[0]:.2f}")
print(f"Coefficient for square footage: {theta[1]:.2f}")
print(f"Coefficient for bedrooms: {theta[2]:.2f}")
```

The resulting theta vector contains the intercept along with the coefficients for each feature (square footage and number of bedrooms).

### Solving with np.linalg.solve()

Although computing the inverse works, it is generally less numerically stable and more computationally expensive than solving the system $Ax = b$ directly. NumPy provides np.linalg.solve(A, b), which solves the system without explicitly computing the inverse. This is usually the preferred approach.

Let's use np.linalg.solve() to solve $(X^T X) \theta = X^T y$. Here, $A = X^T X$ and $b = X^T y$.

```python
# Solve the system (XTX) * theta = XTy directly
theta_solve = np.linalg.solve(XTX, XTy)

print("\nCoefficients (theta) from np.linalg.solve():\n", theta_solve)
print(f"Intercept: {theta_solve[0]:.2f}")
print(f"Coefficient for square footage: {theta_solve[1]:.2f}")
print(f"Coefficient for bedrooms: {theta_solve[2]:.2f}")
```

You will find that the coefficients obtained with np.linalg.solve() are virtually identical to those computed with the explicit inverse. For larger or more complex systems, however, np.linalg.solve() offers better performance and numerical accuracy.

### Visualizing the Fit (Simplified to a Single Feature)

Let's simplify to one feature (square footage) so we can visualize the result.

```python
import numpy as np

# Use square footage only
X_features_simple = np.array([1500, 2000, 1200, 1800, 2500])[:, np.newaxis]  # ensure a column vector
y_target_simple = np.array([300, 450, 250, 400, 550])

# Add the intercept column
X_design_simple = np.hstack([np.ones((X_features_simple.shape[0], 1)), X_features_simple])

# Solve with np.linalg.solve
XTX_simple = X_design_simple.T @ X_design_simple
XTy_simple = X_design_simple.T @ y_target_simple
theta_simple = np.linalg.solve(XTX_simple, XTy_simple)

print("\nCoefficients of the simple model (intercept, square footage):\n", theta_simple)

# Generate points for the regression line
x_line = np.linspace(1100, 2600, 100)  # spans the range of our data
y_line = theta_simple[0] + theta_simple[1] * x_line

# Build the Plotly figure data
plot_data = {
    "layout": {
        "title": "Linear Regression Fit (Price vs. Square Footage)",
        "xaxis": {"title": "Square Footage"},
        "yaxis": {"title": "Price ($1000s)"},
        "hovermode": "closest",
        "template": "plotly_white"  # clean look suited to the web
    },
    "data": [
        {
            "type": "scatter",
            "mode": "markers",
            "x": X_features_simple.flatten().tolist(),
            "y": y_target_simple.tolist(),
            "name": "Data Points",
            "marker": {"color": "#228be6", "size": 10}  # blue
        },
        {
            "type": "scatter",
            "mode": "lines",
            "x": x_line.tolist(),
            "y": y_line.tolist(),
            "name": "Regression Line",
            "line": {"color": "#f03e3e", "width": 3}  # red
        }
    ]
}
```

*Linear regression line fitted to the sample data using coefficients obtained by solving the normal equations.*

This hands-on section shows that solving systems of linear equations is a practical tool for finding the parameters of machine learning models. By expressing the problem in matrix form ($Ax=b$, or more specifically $(X^T X) \theta = X^T y$), we can apply efficient numerical libraries such as NumPy to find the optimal coefficients, either through matrix inversion or, preferably, with a direct solver like np.linalg.solve().
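The stability remark above can be made concrete: in the 2-norm, the condition number of $X^T X$ is the square of the condition number of $X$ (its singular values are the squares of those of $X$), so forming the normal equations amplifies any ill-conditioning already present in the design matrix. A minimal sketch using the same sample data, purely for illustration:

```python
import numpy as np

# Same sample data as above
X_features = np.array([
    [1500, 3],
    [2000, 4],
    [1200, 2],
    [1800, 3],
    [2500, 4]
])
X_design = np.hstack([np.ones((X_features.shape[0], 1)), X_features])

# 2-norm condition numbers: cond(X^T X) equals cond(X)^2 for full-rank X
cond_X = np.linalg.cond(X_design)
cond_XTX = np.linalg.cond(X_design.T @ X_design)

print(f"cond(X)     = {cond_X:.3e}")
print(f"cond(X^T X) = {cond_XTX:.3e}")
print(f"cond(X)^2   = {cond_X**2:.3e}")
```

Roughly speaking, each power of ten in the condition number costs about one decimal digit of accuracy in the solution, which is one reason direct solvers (or methods that avoid $X^T X$ entirely) are preferred on poorly scaled data.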
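For least-squares fitting specifically, NumPy also provides np.linalg.lstsq, which works on $X$ and $y$ directly instead of forming $X^T X$, and handles rank-deficient design matrices as well. A brief sketch comparing it with the normal-equation solution on the same sample data:

```python
import numpy as np

# Same sample data as above
X_features = np.array([
    [1500, 3],
    [2000, 4],
    [1200, 2],
    [1800, 3],
    [2500, 4]
])
y_target = np.array([300, 450, 250, 400, 550])
X_design = np.hstack([np.ones((X_features.shape[0], 1)), X_features])

# Solve min ||X theta - y||^2 directly from X and y
theta_lstsq, residuals, rank, singular_values = np.linalg.lstsq(
    X_design, y_target, rcond=None
)

# Normal-equation solution, for comparison
theta_solve = np.linalg.solve(X_design.T @ X_design, X_design.T @ y_target)

print("lstsq coefficients:", theta_lstsq)
print("solve coefficients:", theta_solve)
print("Agree:", np.allclose(theta_lstsq, theta_solve))
```

On well-conditioned data like this the two approaches agree; on nearly collinear features, lstsq (or a QR/SVD-based solver) is the safer choice.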