I'm trying to combine several preprocessing transformations and a LightGBM model in a single scikit-learn pipeline. The model predicts the prices of second-hand vehicles; once trained, I plan to integrate it into an HTML page for practical use.
```python
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
import joblib

print(numeric_features)
# ['car_year', 'km', 'horse_power', 'cyl_capacity']
print(categorical_features)
# ['make', 'model', 'trimlevel', 'fueltype', 'transmission', 'bodytype', 'color']

# Define transformers for numeric and categorical features
numeric_transformer = Pipeline(steps=[('scaler', StandardScaler())])
categorical_transformer = Pipeline(steps=[('labelencoder', LabelEncoder())])

# Combine transformers using ColumnTransformer
preprocessor = ColumnTransformer(transformers=[
    ('num', numeric_transformer, numeric_features),
    ('cat', categorical_transformer, categorical_features)
])

# Append the LightGBM model to the preprocessing pipeline
pipeline = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('model', best_lgb_model)
])

# Fit the pipeline to training data
pipeline.fit(X_train, y_train)
```

The output I get when training is:
```
LabelEncoder.fit_transform() takes 2 positional arguments but 3 were given
```
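For reference, here is a minimal sketch of how I understand the problem and a version that avoids the error. It assumes `OrdinalEncoder` is an acceptable substitute for `LabelEncoder` on the feature columns: `LabelEncoder` is designed for the target `y` and its `fit_transform()` accepts a single argument, while `Pipeline`/`ColumnTransformer` call `fit_transform(X, y)` on each step. The names `numeric_features`, `categorical_features`, `best_lgb_model`, `X_train`, and `y_train` are the ones defined above.

```python
# Sketch only, assuming OrdinalEncoder is an acceptable substitute:
# unlike LabelEncoder (which targets 1-D y arrays), OrdinalEncoder
# operates on 2-D feature matrices and so fits the transformer
# interface that ColumnTransformer expects.
from sklearn.preprocessing import StandardScaler, OrdinalEncoder
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer

numeric_transformer = Pipeline(steps=[('scaler', StandardScaler())])
categorical_transformer = Pipeline(steps=[
    # map categories unseen at fit time to -1 instead of raising at predict time
    ('ordinal', OrdinalEncoder(handle_unknown='use_encoded_value',
                               unknown_value=-1))
])

preprocessor = ColumnTransformer(transformers=[
    ('num', numeric_transformer, numeric_features),
    ('cat', categorical_transformer, categorical_features)
])

pipeline = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('model', best_lgb_model)  # same LightGBM estimator as above
])

pipeline.fit(X_train, y_train)
```

Is this the right way to wire categorical encoding into the pipeline, or is there a better-suited transformer for LightGBM?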