In a traditional supply chain, a 5% fluctuation in end-customer demand can be amplified into roughly a 40% inventory deviation upstream (data from the MIT Beer Game experiment). Every 1% reduction in demand-forecast error can lift inventory turnover by 7-10% and cut stockout rates by 3-5%.
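The amplification mechanism can be made concrete with a toy order-up-to simulation. This is a minimal illustrative sketch, not the MIT experiment itself: the echelon names, lead time, and smoothing factor are arbitrary assumptions, and the numbers it prints are not the figures quoted above.

```python
import numpy as np

rng = np.random.default_rng(42)

def bullwhip_demo(n_periods=500, n_stages=3, base=100.0, demand_sd=5.0,
                  lead_time=2, alpha=0.3):
    """Each echelon forecasts the orders it receives with exponential smoothing
    and places an order-up-to replenishment order; order variance grows upstream."""
    demand = base + rng.normal(0.0, demand_sd, n_periods)  # ~5% noise around base demand
    series = [demand]
    for _ in range(n_stages):
        incoming = series[-1]
        forecast = incoming[0]
        prev_level = forecast * (lead_time + 1)
        placed = np.empty_like(incoming)
        for t, d in enumerate(incoming):
            forecast = alpha * d + (1 - alpha) * forecast   # update the demand forecast
            level = forecast * (lead_time + 1)              # order-up-to level
            placed[t] = max(d + (level - prev_level), 0.0)  # order passed to the next echelon
            prev_level = level
        series.append(placed)
    return series

stages = bullwhip_demo()
for name, s in zip(["customer demand", "retailer", "distributor", "factory"], stages):
    print(f"{name:15s} std={s.std():6.2f}  ({s.std() / stages[0].std():.1f}x demand volatility)")
```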
A quick comparison of the two model families used below:

| Dimension | LightGBM | Temporal Fusion Transformer (TFT) |
|---|---|---|
| Data volume | 10K-1M samples | 1M+ samples |
| Feature engineering | Manual feature construction | Automatic temporal feature extraction |
| Interpretability | SHAP values | Built-in attention visualization |
| Cold start | Transfer learning possible | Requires pre-training |
Traditional methods rely on historical sales alone; AI pipelines additionally bring in external signals such as promotion flags, discounts, and weather.
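As a minimal sketch of what "bringing in external signals" looks like in practice, the snippet below joins promotion, weather, and SKU master data onto the sales history. The file names are illustrative assumptions; the column names (is_promo, discount, temperature, category, region) match those used later in this article.

```python
import pandas as pd

# Illustrative inputs -- file names are assumptions, not from the article
sales = pd.read_csv("sales.csv", parse_dates=["date"])            # sku, date, sales
promos = pd.read_csv("promo_calendar.csv", parse_dates=["date"])  # sku, date, is_promo, discount
weather = pd.read_csv("weather.csv", parse_dates=["date"])        # region, date, temperature
sku_master = pd.read_csv("sku_master.csv")                        # sku, category, region

df = (
    sales
    .merge(sku_master, on="sku", how="left")
    .merge(promos, on=["sku", "date"], how="left")
    .merge(weather, on=["region", "date"], how="left")
)
# Days without a promotion record are regular days
df["is_promo"] = df["is_promo"].fillna(0).astype(int)
df["discount"] = df["discount"].fillna(0.0)
```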
These signals, and the sales history itself, then need cleaning, e.g. capping promotion-driven spikes and imputing missing values:

```python
# Cap abnormal spikes during promotion periods (winsorize at the promo 99th percentile)
def winsorize_promo(df, promo_col='is_promo', sales_col='sales'):
    promo_mask = df[promo_col] == 1
    q99 = df.loc[promo_mask, sales_col].quantile(0.99)
    df.loc[promo_mask & (df[sales_col] > q99), sales_col] = q99
    return df

# Dynamic missing-value imputation: fill gaps with the SKU's median for the same calendar month
def dynamic_impute(df, sku_col='sku', date_col='date', sales_col='sales'):
    df['imputed'] = df.groupby([sku_col, df[date_col].dt.month])[sales_col].transform(
        lambda x: x.fillna(x.median())
    )
    return df
```
tsfresh can then extract 300+ time-series features automatically and keep only the statistically significant ones:

```python
from tsfresh import extract_features
from tsfresh.feature_extraction import EfficientFCParameters
from tsfresh.feature_selection.relevance import calculate_relevance_table

# Automatically extract 300+ time-series features per SKU
features = extract_features(
    df,
    column_id='sku',
    column_sort='date',
    default_fc_parameters=EfficientFCParameters()
)

# Filter significant features via hypothesis testing.
# calculate_relevance_table expects the target indexed like `features`
# (one value per SKU), so build a per-SKU target first.
target = df.groupby('sku')['demand'].mean()   # illustrative aggregation -- adapt to your setup
relevance = calculate_relevance_table(features, target)
selected_features = relevance[relevance['p_value'] < 0.01]['feature'].tolist()
```
A LightGBM baseline is then trained on lag features, with a recency-weighted MAPE as the evaluation metric:

```python
import lightgbm as lgb
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error as mape

# Build lag features (sales 7/14/30 days ago, per SKU)
def create_lags(df, lags=(7, 14, 30)):
    for lag in lags:
        df[f'sales_lag_{lag}'] = df.groupby('sku')['sales'].shift(lag)
    return df

# Custom evaluation metric: weighted MAPE (recent errors weighted more heavily)
def weighted_mape(preds, eval_data):
    labels = eval_data.get_label()
    # Exponential decay: assuming rows are ordered by time, the newest row gets weight 1
    weights = np.power(0.9, np.arange(len(labels))[::-1])
    return 'weighted_mape', mape(labels, preds, sample_weight=weights), False

# Training (X_train/y_train and the time-based holdout X_valid/y_valid are assumed to exist)
train_data = lgb.Dataset(X_train, label=y_train)
valid_data = lgb.Dataset(X_valid, label=y_valid, reference=train_data)
model = lgb.train(
    params={
        'objective': 'regression_l1',
        'learning_rate': 0.03,
        'max_depth': 8,
        'num_leaves': 64,
        'feature_fraction': 0.8,
        'bagging_fraction': 0.8,
        'bagging_freq': 1,
    },
    train_set=train_data,
    valid_sets=[valid_data],
    feval=weighted_mape,
    num_boost_round=1000,
    callbacks=[lgb.early_stopping(stopping_rounds=100)],
)
```
The Temporal Fusion Transformer is trained on the same data with pytorch-forecasting:

```python
import lightning.pytorch as pl  # `pytorch_lightning` in older pytorch-forecasting versions
from pytorch_forecasting import TemporalFusionTransformer, TimeSeriesDataSet
from pytorch_forecasting.metrics import QuantileLoss

# Define the time-series dataset
dataset = TimeSeriesDataSet(
    data=df,
    time_idx="time_idx",
    target="demand",
    group_ids=["sku", "warehouse"],
    min_encoder_length=30,
    max_encoder_length=60,
    min_prediction_length=7,
    max_prediction_length=14,
    static_categoricals=["category", "region"],
    time_varying_known_reals=["discount", "temperature"],
    time_varying_unknown_reals=["demand"],
    add_relative_time_idx=True,
    add_target_scales=True,
    add_encoder_length=True,
)
# Dataloaders (a proper setup would build a separate validation TimeSeriesDataSet; simplified here)
train_dataloader = dataset.to_dataloader(train=True, batch_size=128)
val_dataloader = dataset.to_dataloader(train=False, batch_size=128)

# Training
tft = TemporalFusionTransformer.from_dataset(
    dataset,
    learning_rate=0.01,
    hidden_size=32,
    attention_head_size=4,
    dropout=0.1,
    loss=QuantileLoss(),
)
trainer = pl.Trainer(max_epochs=30, gradient_clip_val=0.1)  # illustrative trainer settings
trainer.fit(
    tft,
    train_dataloaders=train_dataloader,
    val_dataloaders=val_dataloader,
)

# Attention visualisation: identify the key demand drivers
raw_output = tft.predict(val_dataloader, mode="raw")  # newer versions: pass raw_output.output below
interpretation = tft.interpret_output(raw_output, reduction="sum")
tft.plot_interpretation(interpretation)
```
On the interpretability side, SHAP explains the LightGBM model's behaviour at the feature level:

```python
import shap

# Global feature importance for the LightGBM model
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Finding: when discount > 30% and temperature < 5 degC, promotions actually suppress demand (SHAP value < -0.3)
shap.dependence_plot(
    "discount",
    shap_values,
    X_test,
    interaction_index="temperature"
)
```

The TFT's static variable selection weights give a complementary view of which SKU attributes drive demand.
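As a hedged sketch of reading those weights, the values can be pulled from the `interpretation` dict produced above and ranked. The key and attribute names (`"static_variables"`, `tft.static_variables`) follow pytorch-forecasting's `TemporalFusionTransformer.interpret_output`; verify them against your installed version.

```python
import pandas as pd

# Rank the static variable selection weights (summed over the batch by reduction="sum")
static_weights = interpretation["static_variables"].detach().cpu().numpy()
static_importance = (
    pd.Series(static_weights, index=tft.static_variables)
    .sort_values(ascending=False)
)
print(static_importance / static_importance.sum())  # normalised importance per static input
```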
On the serving side, real-time signals such as temperature are streamed in through Kafka and Faust and cached for the feature pipeline:

```python
from datetime import datetime

import faust
import redis

# Real-time temperature feed (the local Redis cache settings here are illustrative)
redis_client = redis.Redis(host="localhost", port=6379)

app = faust.App('feature-pipeline', broker='kafka://localhost:9092')
temperature_topic = app.topic('weather', value_type=float)

@app.agent(temperature_topic)
async def process_temperature(temperatures):
    async for temp in temperatures:
        # Cache today's temperature for the online feature store
        redis_client.set(f"temp_{datetime.now().date()}", temp)
```

The LightGBM model is converted to ONNX format and deployed on in-store edge devices:
```python
import onnxmltools
from onnxmltools.convert.common.data_types import FloatTensorType

# Input signature: a float feature matrix with the same columns as X_train
initial_types = [("input", FloatTensorType([None, X_train.shape[1]]))]
onnx_model = onnxmltools.convert_lightgbm(model, initial_types=initial_types)
onnxmltools.utils.save_model(onnx_model, 'demand_predict.onnx')
```
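To complete the edge-deployment picture, here is a minimal sketch of scoring with the exported model via onnxruntime on the device. The model path comes from the snippet above; `feature_rows` stands in for whatever feature matrix the device assembles and is an assumption.

```python
import numpy as np
import onnxruntime as ort

# Load the exported model on the edge device
sess = ort.InferenceSession("demand_predict.onnx")
input_name = sess.get_inputs()[0].name

# `feature_rows`: feature matrix assembled on the device (assumed), same column order as training
features = np.asarray(feature_rows, dtype=np.float32)    # shape: (n_skus, n_features)
pred = sess.run(None, {input_name: features})[0]         # first output = predicted demand
```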
Feature distribution drift is monitored with a Kolmogorov-Smirnov test:

```python
from scipy.stats import ks_2samp

# Detect drift in the temperature feature (reference window vs. current window)
_, p_value = ks_2samp(
    reference_data['temperature'],
    current_data['temperature']
)
if p_value < 0.05:
    trigger_model_retrain()  # hook into the team's retraining job (placeholder)
```