27. Text Classification with AutoGluon

AutoGluon documentation: https://auto.gluon.ai/stable/index.html

AutoGluon is an open-source AutoML library from AWS that automates machine learning on multimodal data (tabular, text, image, and time series). It relies on stack ensembling, and its authors report results that are highly competitive with entries in Kaggle competitions. See the papers listed in the GitHub repo: awslabs/autogluon
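To make the idea of stack ensembling concrete, here is a minimal sketch using scikit-learn's `StackingClassifier`. It is not AutoGluon's own implementation. The synthetic dataset, the two base models, and the logistic-regression meta-model are all assumptions chosen for illustration. The base models are fit with cross-validation, and the final model learns from their out-of-fold predictions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data stands in for a real dataset (assumption for illustration).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Level-0 (base) models: their cross-validated predictions become features.
base_models = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]

# Level-1 (meta) model: learns how to combine the base-model predictions.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

stack_accuracy = stack.score(X_test, y_test)
print(f"Stacked accuracy: {stack_accuracy:.3f}")
```

AutoGluon automates this kind of layering across many model families, so none of these pieces has to be set up by hand.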

from google.colab import drive
drive.mount('/content/drive')  # Add My Drive/<>

import os
os.chdir('drive/My Drive')
os.chdir('Books_Writings/NLPBook/')

Mounted at /content/drive

%%capture
# %pylab inline
import pandas as pd
import os
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns

27.1. Use AutoGluon Tabular on News Dataset

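We first need to install Meta's PyTorch framework and AutoGluon, which runs on top of PyTorch. The install cell below does both in one step: pointing pip at the PyTorch CPU wheel index makes it pull in the smaller CPU build of PyTorch as a dependency of AutoGluon. This is a large installation and will take some time.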

AutoGluon installation instructions: https://auto.gluon.ai/stable/install.html
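Once the installation cell has finished, a quick check along the following lines can confirm which versions ended up in the environment. This is a minimal sketch that uses only the standard library. The helper name `installed_version` is hypothetical and not part of AutoGluon; it returns `None` when a package is missing instead of raising an error.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the packages this chapter depends on.
for name in ("torch", "torchvision", "autogluon"):
    found = installed_version(name)
    print(f"{name}: {found if found is not None else 'not installed'}")
```

In Colab, it may be necessary to restart the runtime after installation before the newly installed versions can be imported.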

%%time
!pip install -U pip
!pip install -U setuptools wheel
# !pip install -U uv

# CPU version of pytorch has smaller footprint - see installation instructions in
# pytorch documentation - https://pytorch.org/get-started/locally/
# !uv pip install torch==2.3.1 torchvision==0.18.1 --index-url https://download.pytorch.org/whl/cpu --system

# !uv pip install autogluon --system
# !pip install autogluon
!pip install autogluon --extra-index-url https://download.pytorch.org/whl/cpu

Requirement already satisfied: pip in /usr/local/lib/python3.11/dist-packages (24.1.2)
Collecting pip
  Downloading pip-25.0.1-py3-none-any.whl.metadata (3.7 kB)
Downloading pip-25.0.1-py3-none-any.whl (1.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 29.2 MB/s eta 0:00:00
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 24.1.2
    Uninstalling pip-24.1.2:
      Successfully uninstalled pip-24.1.2
Successfully installed pip-25.0.1
Requirement already satisfied: setuptools in /usr/local/lib/python3.11/dist-packages (75.2.0)
Collecting setuptools
  Downloading setuptools-78.1.0-py3-none-any.whl.metadata (6.6 kB)
Requirement already satisfied: wheel in /usr/local/lib/python3.11/dist-packages (0.45.1)
Downloading setuptools-78.1.0-py3-none-any.whl (1.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 36.2 MB/s eta 0:00:00
Installing collected packages: setuptools
  Attempting uninstall: setuptools
    Found existing installation: setuptools 75.2.0
    Uninstalling setuptools-75.2.0:
      Successfully uninstalled setuptools-75.2.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ipython 7.34.0 requires jedi>=0.16, which is not installed.
Successfully installed setuptools-78.1.0

Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cpu
Collecting autogluon
  Downloading autogluon-1.2-py3-none-any.whl.metadata (11 kB)
Collecting autogluon.core==1.2 (from autogluon.core[all]==1.2->autogluon)
  Downloading autogluon.core-1.2-py3-none-any.whl.metadata (12 kB)
Collecting autogluon.features==1.2 (from autogluon)
  Downloading autogluon.features-1.2-py3-none-any.whl.metadata (11 kB)
Collecting autogluon.tabular==1.2 (from autogluon.tabular[all]==1.2->autogluon)
  Downloading autogluon.tabular-1.2-py3-none-any.whl.metadata (14 kB)
Collecting autogluon.multimodal==1.2 (from autogluon)
  Downloading autogluon.multimodal-1.2-py3-none-any.whl.metadata (12 kB)
Collecting autogluon.timeseries==1.2 (from autogluon.timeseries[all]==1.2->autogluon)
  Downloading autogluon.timeseries-1.2-py3-none-any.whl.metadata (12 kB)
Requirement already satisfied: numpy<2.1.4,>=1.25.0 in /usr/local/lib/python3.11/dist-packages (from autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (2.0.2)
Requirement already satisfied: scipy<1.16,>=1.5.4 in /usr/local/lib/python3.11/dist-packages (from autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (1.14.1)
Collecting scikit-learn<1.5.3,>=1.4.0 (from autogluon.core==1.2->autogluon.core[all]==1.2->autogluon)
  Downloading scikit_learn-1.5.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
Requirement already satisfied: networkx<4,>=3.0 in /usr/local/lib/python3.11/dist-packages (from autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (3.4.2)
Requirement already satisfied: pandas<2.3.0,>=2.0.0 in /usr/local/lib/python3.11/dist-packages (from autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (2.2.2)
Requirement already satisfied: tqdm<5,>=4.38 in /usr/local/lib/python3.11/dist-packages (from autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (4.67.1)
Requirement already satisfied: requests in /usr/local/lib/python3.11/dist-packages (from autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (2.32.3)
Requirement already satisfied: matplotlib<3.11,>=3.7.0 in /usr/local/lib/python3.11/dist-packages (from autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (3.10.0)
Collecting boto3<2,>=1.10 (from autogluon.core==1.2->autogluon.core[all]==1.2->autogluon)
  Downloading boto3-1.37.31-py3-none-any.whl.metadata (6.7 kB)
Collecting autogluon.common==1.2 (from autogluon.core==1.2->autogluon.core[all]==1.2->autogluon)
  Downloading autogluon.common-1.2-py3-none-any.whl.metadata (11 kB)
Collecting ray<2.40,>=2.10.0 (from ray[default]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon)
  Downloading ray-2.39.0-cp311-cp311-manylinux2014_x86_64.whl.metadata (17 kB)
Requirement already satisfied: pyarrow>=15.0.0 in /usr/local/lib/python3.11/dist-packages (from autogluon.core[all]==1.2->autogluon) (18.1.0)
Requirement already satisfied: hyperopt<0.2.8,>=0.2.7 in /usr/local/lib/python3.11/dist-packages (from autogluon.core[all]==1.2->autogluon) (0.2.7)
Requirement already satisfied: Pillow<12,>=10.0.1 in /usr/local/lib/python3.11/dist-packages (from autogluon.multimodal==1.2->autogluon) (11.1.0)
Collecting torch<2.6,>=2.2 (from autogluon.multimodal==1.2->autogluon)
  Downloading https://download.pytorch.org/whl/cpu/torch-2.5.1%2Bcpu-cp311-cp311-linux_x86_64.whl (174.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 174.7/174.7 MB 61.8 MB/s eta 0:00:00
Collecting lightning<2.6,>=2.2 (from autogluon.multimodal==1.2->autogluon)
  Downloading lightning-2.5.1-py3-none-any.whl.metadata (39 kB)
Requirement already satisfied: transformers<5,>=4.38.0 in /usr/local/lib/python3.11/dist-packages (from transformers[sentencepiece]<5,>=4.38.0->autogluon.multimodal==1.2->autogluon) (4.50.3)
Collecting accelerate<1.0,>=0.34.0 (from autogluon.multimodal==1.2->autogluon)
  Downloading accelerate-0.34.2-py3-none-any.whl.metadata (19 kB)
Collecting jsonschema<4.22,>=4.18 (from autogluon.multimodal==1.2->autogluon)
  Downloading jsonschema-4.21.1-py3-none-any.whl.metadata (7.8 kB)
Collecting seqeval<1.3.0,>=1.2.2 (from autogluon.multimodal==1.2->autogluon)
  Downloading seqeval-1.2.2.tar.gz (43 kB)
  Preparing metadata (setup.py) ... done
Collecting evaluate<0.5.0,>=0.4.0 (from autogluon.multimodal==1.2->autogluon)
  Downloading evaluate-0.4.3-py3-none-any.whl.metadata (9.2 kB)
Collecting timm<1.0.7,>=0.9.5 (from autogluon.multimodal==1.2->autogluon)
  Downloading timm-1.0.3-py3-none-any.whl.metadata (43 kB)
Collecting torchvision<0.21.0,>=0.16.0 (from autogluon.multimodal==1.2->autogluon)
  Downloading https://download.pytorch.org/whl/cpu/torchvision-0.20.1%2Bcpu-cp311-cp311-linux_x86_64.whl (1.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 73.4 MB/s eta 0:00:00
Collecting scikit-image<0.25.0,>=0.19.1 (from autogluon.multimodal==1.2->autogluon)
  Downloading scikit_image-0.24.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (14 kB)
Requirement already satisfied: text-unidecode<1.4,>=1.3 in /usr/local/lib/python3.11/dist-packages (from autogluon.multimodal==1.2->autogluon) (1.3)
Collecting torchmetrics<1.3.0,>=1.2.0 (from autogluon.multimodal==1.2->autogluon)
  Downloading torchmetrics-1.2.1-py3-none-any.whl.metadata (20 kB)
Collecting omegaconf<2.3.0,>=2.1.1 (from autogluon.multimodal==1.2->autogluon)
  Downloading omegaconf-2.2.3-py3-none-any.whl.metadata (3.9 kB)
Collecting pytorch-metric-learning<2.4,>=1.3.0 (from autogluon.multimodal==1.2->autogluon)
  Downloading pytorch_metric_learning-2.3.0-py3-none-any.whl.metadata (17 kB)
Collecting nlpaug<1.2.0,>=1.1.10 (from autogluon.multimodal==1.2->autogluon)
  Downloading nlpaug-1.1.11-py3-none-any.whl.metadata (14 kB)
Collecting nltk<3.9,>=3.4.5 (from autogluon.multimodal==1.2->autogluon)
  Downloading nltk-3.8.1-py3-none-any.whl.metadata (2.8 kB)
Collecting openmim<0.4.0,>=0.3.7 (from autogluon.multimodal==1.2->autogluon)
  Downloading openmim-0.3.9-py2.py3-none-any.whl.metadata (16 kB)
Requirement already satisfied: defusedxml<0.7.2,>=0.7.1 in /usr/local/lib/python3.11/dist-packages (from autogluon.multimodal==1.2->autogluon) (0.7.1)
Requirement already satisfied: jinja2<3.2,>=3.0.3 in /usr/local/lib/python3.11/dist-packages (from autogluon.multimodal==1.2->autogluon) (3.1.6)
Requirement already satisfied: tensorboard<3,>=2.9 in /usr/local/lib/python3.11/dist-packages (from autogluon.multimodal==1.2->autogluon) (2.18.0)
Collecting pytesseract<0.3.11,>=0.3.9 (from autogluon.multimodal==1.2->autogluon)
  Downloading pytesseract-0.3.10-py3-none-any.whl.metadata (11 kB)
Collecting nvidia-ml-py3==7.352.0 (from autogluon.multimodal==1.2->autogluon)
  Downloading nvidia-ml-py3-7.352.0.tar.gz (19 kB)
  Preparing metadata (setup.py) ... done
Collecting pdf2image<1.19,>=1.17.0 (from autogluon.multimodal==1.2->autogluon)
  Downloading pdf2image-1.17.0-py3-none-any.whl.metadata (6.2 kB)
Collecting catboost<1.3,>=1.2 (from autogluon.tabular[all]==1.2->autogluon)
  Downloading catboost-1.2.7-cp311-cp311-manylinux2014_x86_64.whl.metadata (1.2 kB)
Collecting numpy<2.1.4,>=1.25.0 (from autogluon.core==1.2->autogluon.core[all]==1.2->autogluon)
  Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Collecting spacy<3.8 (from autogluon.tabular[all]==1.2->autogluon)
  Downloading spacy-3.7.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (27 kB)
Requirement already satisfied: lightgbm<4.6,>=4.0 in /usr/local/lib/python3.11/dist-packages (from autogluon.tabular[all]==1.2->autogluon) (4.5.0)
Requirement already satisfied: einops<0.9,>=0.7 in /usr/local/lib/python3.11/dist-packages (from autogluon.tabular[all]==1.2->autogluon) (0.8.1)
Requirement already satisfied: xgboost<2.2,>=1.6 in /usr/local/lib/python3.11/dist-packages (from autogluon.tabular[all]==1.2->autogluon) (2.1.4)
Requirement already satisfied: fastai<2.8,>=2.3.1 in /usr/local/lib/python3.11/dist-packages (from autogluon.tabular[all]==1.2->autogluon) (2.7.19)
Requirement already satisfied: huggingface-hub[torch] in /usr/local/lib/python3.11/dist-packages (from autogluon.tabular[all]==1.2->autogluon) (0.30.1)
Requirement already satisfied: joblib<2,>=1.1 in /usr/local/lib/python3.11/dist-packages (from autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (1.4.2)
Collecting pytorch-lightning (from autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading pytorch_lightning-2.5.1-py3-none-any.whl.metadata (20 kB)
Collecting gluonts<0.17,>=0.15.0 (from autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading gluonts-0.16.1-py3-none-any.whl.metadata (9.8 kB)
Collecting statsforecast<1.8,>=1.7.0 (from autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading statsforecast-1.7.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (28 kB)
Collecting mlforecast==0.13.4 (from autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading mlforecast-0.13.4-py3-none-any.whl.metadata (12 kB)
Collecting utilsforecast<0.2.5,>=0.2.3 (from autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading utilsforecast-0.2.4-py3-none-any.whl.metadata (7.4 kB)
Collecting coreforecast==0.0.12 (from autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading coreforecast-0.0.12-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.6 kB)
Collecting fugue>=0.9.0 (from autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading fugue-0.9.1-py3-none-any.whl.metadata (18 kB)
Requirement already satisfied: orjson~=3.9 in /usr/local/lib/python3.11/dist-packages (from autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (3.10.16)
Requirement already satisfied: psutil<7.0.0,>=5.7.3 in /usr/local/lib/python3.11/dist-packages (from autogluon.common==1.2->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (5.9.5)
Requirement already satisfied: cloudpickle in /usr/local/lib/python3.11/dist-packages (from mlforecast==0.13.4->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (3.1.1)
Requirement already satisfied: fsspec in /usr/local/lib/python3.11/dist-packages (from mlforecast==0.13.4->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (2025.3.2)
Requirement already satisfied: numba in /usr/local/lib/python3.11/dist-packages (from mlforecast==0.13.4->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (0.60.0)
Collecting optuna (from mlforecast==0.13.4->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading optuna-4.2.1-py3-none-any.whl.metadata (17 kB)
Requirement already satisfied: packaging in /usr/local/lib/python3.11/dist-packages (from mlforecast==0.13.4->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (24.2)
Collecting window-ops (from mlforecast==0.13.4->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading window_ops-0.0.15-py3-none-any.whl.metadata (6.8 kB)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.11/dist-packages (from accelerate<1.0,>=0.34.0->autogluon.multimodal==1.2->autogluon) (6.0.2)
Requirement already satisfied: safetensors>=0.4.3 in /usr/local/lib/python3.11/dist-packages (from accelerate<1.0,>=0.34.0->autogluon.multimodal==1.2->autogluon) (0.5.3)
Collecting botocore<1.38.0,>=1.37.31 (from boto3<2,>=1.10->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon)
  Downloading botocore-1.37.31-py3-none-any.whl.metadata (5.7 kB)
Collecting jmespath<2.0.0,>=0.7.1 (from boto3<2,>=1.10->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon)
  Downloading jmespath-1.0.1-py3-none-any.whl.metadata (7.6 kB)
Collecting s3transfer<0.12.0,>=0.11.0 (from boto3<2,>=1.10->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon)
  Downloading s3transfer-0.11.4-py3-none-any.whl.metadata (1.7 kB)
Requirement already satisfied: graphviz in /usr/local/lib/python3.11/dist-packages (from catboost<1.3,>=1.2->autogluon.tabular[all]==1.2->autogluon) (0.20.3)
Requirement already satisfied: plotly in /usr/local/lib/python3.11/dist-packages (from catboost<1.3,>=1.2->autogluon.tabular[all]==1.2->autogluon) (5.24.1)
Requirement already satisfied: six in /usr/local/lib/python3.11/dist-packages (from catboost<1.3,>=1.2->autogluon.tabular[all]==1.2->autogluon) (1.17.0)
Collecting datasets>=2.0.0 (from evaluate<0.5.0,>=0.4.0->autogluon.multimodal==1.2->autogluon)
  Downloading datasets-3.5.0-py3-none-any.whl.metadata (19 kB)
Collecting dill (from evaluate<0.5.0,>=0.4.0->autogluon.multimodal==1.2->autogluon)
  Downloading dill-0.3.9-py3-none-any.whl.metadata (10 kB)
Collecting xxhash (from evaluate<0.5.0,>=0.4.0->autogluon.multimodal==1.2->autogluon)
  Downloading xxhash-3.5.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)
Collecting multiprocess (from evaluate<0.5.0,>=0.4.0->autogluon.multimodal==1.2->autogluon)
  Downloading multiprocess-0.70.17-py311-none-any.whl.metadata (7.2 kB)
Requirement already satisfied: pip in /usr/local/lib/python3.11/dist-packages (from fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.2->autogluon) (25.0.1)
Requirement already satisfied: fastdownload<2,>=0.0.5 in /usr/local/lib/python3.11/dist-packages (from fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.2->autogluon) (0.0.7)
Requirement already satisfied: fastcore<1.8,>=1.5.29 in /usr/local/lib/python3.11/dist-packages (from fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.2->autogluon) (1.7.29)
Requirement already satisfied: fastprogress>=0.2.4 in /usr/local/lib/python3.11/dist-packages (from fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.2->autogluon) (1.0.3)
Collecting triad>=0.9.7 (from fugue>=0.9.0->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading triad-0.9.8-py3-none-any.whl.metadata (6.3 kB)
Collecting adagio>=0.2.4 (from fugue>=0.9.0->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading adagio-0.2.6-py3-none-any.whl.metadata (1.8 kB)
Requirement already satisfied: pydantic<3,>=1.7 in /usr/local/lib/python3.11/dist-packages (from gluonts<0.17,>=0.15.0->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (2.11.2)
Requirement already satisfied: toolz~=0.10 in /usr/local/lib/python3.11/dist-packages (from gluonts<0.17,>=0.15.0->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (0.12.1)
Requirement already satisfied: typing-extensions~=4.0 in /usr/local/lib/python3.11/dist-packages (from gluonts<0.17,>=0.15.0->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (4.13.1)
Requirement already satisfied: future in /usr/local/lib/python3.11/dist-packages (from hyperopt<0.2.8,>=0.2.7->autogluon.core[all]==1.2->autogluon) (1.0.0)
Requirement already satisfied: py4j in /usr/local/lib/python3.11/dist-packages (from hyperopt<0.2.8,>=0.2.7->autogluon.core[all]==1.2->autogluon) (0.10.9.7)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.11/dist-packages (from jinja2<3.2,>=3.0.3->autogluon.multimodal==1.2->autogluon) (3.0.2)
Requirement already satisfied: attrs>=22.2.0 in /usr/local/lib/python3.11/dist-packages (from jsonschema<4.22,>=4.18->autogluon.multimodal==1.2->autogluon) (25.3.0)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /usr/local/lib/python3.11/dist-packages (from jsonschema<4.22,>=4.18->autogluon.multimodal==1.2->autogluon) (2024.10.1)
Requirement already satisfied: referencing>=0.28.4 in /usr/local/lib/python3.11/dist-packages (from jsonschema<4.22,>=4.18->autogluon.multimodal==1.2->autogluon) (0.36.2)
Requirement already satisfied: rpds-py>=0.7.1 in /usr/local/lib/python3.11/dist-packages (from jsonschema<4.22,>=4.18->autogluon.multimodal==1.2->autogluon) (0.24.0)
Collecting lightning-utilities<2.0,>=0.10.0 (from lightning<2.6,>=2.2->autogluon.multimodal==1.2->autogluon)
  Downloading lightning_utilities-0.14.3-py3-none-any.whl.metadata (5.6 kB)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib<3.11,>=3.7.0->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (1.3.1)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.11/dist-packages (from matplotlib<3.11,>=3.7.0->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.11/dist-packages (from matplotlib<3.11,>=3.7.0->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (4.57.0)
Requirement already satisfied: kiwisolver>=1.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib<3.11,>=3.7.0->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (1.4.8)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.11/dist-packages (from matplotlib<3.11,>=3.7.0->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (3.2.3)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.11/dist-packages (from matplotlib<3.11,>=3.7.0->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (2.8.2)
Requirement already satisfied: gdown>=4.0.0 in /usr/local/lib/python3.11/dist-packages (from nlpaug<1.2.0,>=1.1.10->autogluon.multimodal==1.2->autogluon) (5.2.0)
Requirement already satisfied: click in /usr/local/lib/python3.11/dist-packages (from nltk<3.9,>=3.4.5->autogluon.multimodal==1.2->autogluon) (8.1.8)
Requirement already satisfied: regex>=2021.8.3 in /usr/local/lib/python3.11/dist-packages (from nltk<3.9,>=3.4.5->autogluon.multimodal==1.2->autogluon) (2024.11.6)
Collecting antlr4-python3-runtime==4.9.* (from omegaconf<2.3.0,>=2.1.1->autogluon.multimodal==1.2->autogluon)
  Downloading antlr4-python3-runtime-4.9.3.tar.gz (117 kB)
  Preparing metadata (setup.py) ... done
Collecting colorama (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.2->autogluon)
  Downloading https://download.pytorch.org/whl/colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Collecting model-index (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.2->autogluon)
  Downloading model_index-0.1.11-py3-none-any.whl.metadata (3.9 kB)
Collecting opendatalab (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.2->autogluon)
  Downloading opendatalab-0.0.10-py3-none-any.whl.metadata (6.4 kB)
Requirement already satisfied: rich in /usr/local/lib/python3.11/dist-packages (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.2->autogluon) (13.9.4)
Requirement already satisfied: tabulate in /usr/local/lib/python3.11/dist-packages (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.2->autogluon) (0.9.0)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.11/dist-packages (from pandas<2.3.0,>=2.0.0->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (2025.2)
Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.11/dist-packages (from pandas<2.3.0,>=2.0.0->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (2025.2)
Requirement already satisfied: filelock in /usr/local/lib/python3.11/dist-packages (from ray<2.40,>=2.10.0->ray[default]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (3.18.0)
Requirement already satisfied: msgpack<2.0.0,>=1.0.0 in /usr/local/lib/python3.11/dist-packages (from ray<2.40,>=2.10.0->ray[default]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (1.1.0)
Requirement already satisfied: protobuf!=3.19.5,>=3.15.3 in /usr/local/lib/python3.11/dist-packages (from ray<2.40,>=2.10.0->ray[default]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (5.29.4)
Requirement already satisfied: aiosignal in /usr/local/lib/python3.11/dist-packages (from ray<2.40,>=2.10.0->ray[default]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (1.3.2)
Requirement already satisfied: frozenlist in /usr/local/lib/python3.11/dist-packages (from ray<2.40,>=2.10.0->ray[default]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (1.5.0)
Requirement already satisfied: aiohttp>=3.7 in /usr/local/lib/python3.11/dist-packages (from ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (3.11.15)
Collecting aiohttp-cors (from ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon)
  Downloading aiohttp_cors-0.8.1-py3-none-any.whl.metadata (20 kB)
Collecting colorful (from ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon)
  Downloading colorful-0.5.6-py2.py3-none-any.whl.metadata (16 kB)
Collecting py-spy>=0.2.0 (from ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon)
  Downloading py_spy-0.4.0-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (16 kB)
Collecting opencensus (from ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon)
  Downloading opencensus-0.11.4-py2.py3-none-any.whl.metadata (12 kB)
Requirement already satisfied: prometheus-client>=0.7.1 in /usr/local/lib/python3.11/dist-packages (from ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (0.21.1)
Requirement already satisfied: smart-open in /usr/local/lib/python3.11/dist-packages (from ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (7.1.0)
Collecting virtualenv!=20.21.1,>=20.0.24 (from ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon)
  Downloading virtualenv-20.30.0-py3-none-any.whl.metadata (4.5 kB)
Requirement already satisfied: grpcio>=1.42.0 in /usr/local/lib/python3.11/dist-packages (from ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (1.71.0)
Collecting memray (from ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon)
  Downloading memray-1.17.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (20 kB)
Collecting tensorboardX>=1.9 (from ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon)
  Downloading tensorboardX-2.6.2.2-py2.py3-none-any.whl.metadata (5.8 kB)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.11/dist-packages (from requests->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (3.4.1)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.11/dist-packages (from requests->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.11/dist-packages (from requests->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (2.3.0)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.11/dist-packages (from requests->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (2025.1.31)
Requirement already satisfied: imageio>=2.33 in /usr/local/lib/python3.11/dist-packages (from scikit-image<0.25.0,>=0.19.1->autogluon.multimodal==1.2->autogluon) (2.37.0)
Requirement already satisfied: tifffile>=2022.8.12 in /usr/local/lib/python3.11/dist-packages (from scikit-image<0.25.0,>=0.19.1->autogluon.multimodal==1.2->autogluon) (2025.3.30)
Requirement already satisfied: lazy-loader>=0.4 in /usr/local/lib/python3.11/dist-packages (from scikit-image<0.25.0,>=0.19.1->autogluon.multimodal==1.2->autogluon) (0.4)
Requirement already satisfied: threadpoolctl>=3.1.0 in /usr/local/lib/python3.11/dist-packages (from scikit-learn<1.5.3,>=1.4.0->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon) (3.6.0)
Requirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in /usr/local/lib/python3.11/dist-packages (from spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (3.0.12)
Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in /usr/local/lib/python3.11/dist-packages (from spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (1.0.5)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.11/dist-packages (from spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (1.0.12)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.11/dist-packages (from spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (2.0.11)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.11/dist-packages (from spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (3.0.9)
Collecting thinc<8.3.0,>=8.2.2 (from spacy<3.8->autogluon.tabular[all]==1.2->autogluon)
  Downloading thinc-8.2.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (15 kB)
Requirement already satisfied: wasabi<1.2.0,>=0.9.1 in /usr/local/lib/python3.11/dist-packages (from spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (1.1.3)
Requirement already satisfied: srsly<3.0.0,>=2.4.3 in /usr/local/lib/python3.11/dist-packages (from spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (2.5.1)
Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in /usr/local/lib/python3.11/dist-packages (from spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (2.0.10)
Requirement already satisfied: weasel<0.5.0,>=0.1.0 in /usr/local/lib/python3.11/dist-packages (from spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (0.4.1)
Requirement already satisfied: typer<1.0.0,>=0.3.0 in /usr/local/lib/python3.11/dist-packages (from spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (0.15.2)
Requirement already satisfied: setuptools in /usr/local/lib/python3.11/dist-packages (from spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (78.1.0)
Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in /usr/local/lib/python3.11/dist-packages (from spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (3.5.0)
Requirement already satisfied: statsmodels>=0.13.2 in /usr/local/lib/python3.11/dist-packages (from statsforecast<1.8,>=1.7.0->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (0.14.4)
Requirement already satisfied: absl-py>=0.4 in /usr/local/lib/python3.11/dist-packages (from tensorboard<3,>=2.9->autogluon.multimodal==1.2->autogluon) (1.4.0)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.11/dist-packages (from tensorboard<3,>=2.9->autogluon.multimodal==1.2->autogluon) (3.7)
Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in /usr/local/lib/python3.11/dist-packages (from tensorboard<3,>=2.9->autogluon.multimodal==1.2->autogluon) (0.7.2)
Requirement already satisfied: werkzeug>=1.0.1 in /usr/local/lib/python3.11/dist-packages (from tensorboard<3,>=2.9->autogluon.multimodal==1.2->autogluon) (3.1.3)
Requirement already satisfied: sympy==1.13.1 in /usr/local/lib/python3.11/dist-packages (from torch<2.6,>=2.2->autogluon.multimodal==1.2->autogluon) (1.13.1)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.11/dist-packages (from sympy==1.13.1->torch<2.6,>=2.2->autogluon.multimodal==1.2->autogluon) (1.3.0)
Requirement already satisfied: tokenizers<0.22,>=0.21 in /usr/local/lib/python3.11/dist-packages (from transformers<5,>=4.38.0->transformers[sentencepiece]<5,>=4.38.0->autogluon.multimodal==1.2->autogluon) (0.21.1)
Requirement already satisfied: sentencepiece!=0.1.92,>=0.1.91 in /usr/local/lib/python3.11/dist-packages (from transformers[sentencepiece]<5,>=4.38.0->autogluon.multimodal==1.2->autogluon) (0.2.0)
Requirement already satisfied: nvidia-nccl-cu12 in /usr/local/lib/python3.11/dist-packages (from xgboost<2.2,>=1.6->autogluon.tabular[all]==1.2->autogluon) (2.21.5)
Requirement already satisfied: aiohappyeyeballs>=2.3.0 in /usr/local/lib/python3.11/dist-packages (from aiohttp>=3.7->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (2.6.1)
Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.11/dist-packages (from aiohttp>=3.7->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (6.2.0)
Requirement already satisfied: propcache>=0.2.0 in /usr/local/lib/python3.11/dist-packages (from aiohttp>=3.7->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (0.3.1)
Requirement already satisfied: yarl<2.0,>=1.17.0 in /usr/local/lib/python3.11/dist-packages (from aiohttp>=3.7->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (1.18.3)
Collecting dill (from evaluate<0.5.0,>=0.4.0->autogluon.multimodal==1.2->autogluon)
  Downloading dill-0.3.8-py3-none-any.whl.metadata (10 kB)
Collecting multiprocess (from evaluate<0.5.0,>=0.4.0->autogluon.multimodal==1.2->autogluon)
  Downloading multiprocess-0.70.16-py311-none-any.whl.metadata (7.2 kB)
Collecting fsspec (from mlforecast==0.13.4->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading fsspec-2024.12.0-py3-none-any.whl.metadata (11 kB)
Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.11/dist-packages (from gdown>=4.0.0->nlpaug<1.2.0,>=1.1.10->autogluon.multimodal==1.2->autogluon) (4.13.3)
Requirement already satisfied: language-data>=1.2 in /usr/local/lib/python3.11/dist-packages (from langcodes<4.0.0,>=3.2.0->spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (1.3.0)
Requirement already satisfied: llvmlite<0.44,>=0.43.0dev0 in /usr/local/lib/python3.11/dist-packages (from numba->mlforecast==0.13.4->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (0.43.0)
Requirement already satisfied: annotated-types>=0.6.0 in /usr/local/lib/python3.11/dist-packages (from pydantic<3,>=1.7->gluonts<0.17,>=0.15.0->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (0.7.0)
Requirement already satisfied: pydantic-core==2.33.1 in /usr/local/lib/python3.11/dist-packages (from pydantic<3,>=1.7->gluonts<0.17,>=0.15.0->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (2.33.1)
Requirement already satisfied: typing-inspection>=0.4.0 in /usr/local/lib/python3.11/dist-packages (from pydantic<3,>=1.7->gluonts<0.17,>=0.15.0->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (0.4.0)
Requirement already satisfied: patsy>=0.5.6 in /usr/local/lib/python3.11/dist-packages (from statsmodels>=0.13.2->statsforecast<1.8,>=1.7.0->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (1.0.1)
Collecting blis<0.8.0,>=0.7.8 (from thinc<8.3.0,>=8.2.2->spacy<3.8->autogluon.tabular[all]==1.2->autogluon)
  Downloading blis-0.7.11-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.4 kB)
Requirement already satisfied: confection<1.0.0,>=0.0.1 in /usr/local/lib/python3.11/dist-packages (from thinc<8.3.0,>=8.2.2->spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (0.1.5)
Collecting fs (from triad>=0.9.7->fugue>=0.9.0->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading fs-2.4.16-py2.py3-none-any.whl.metadata (6.3 kB)
Requirement already satisfied: shellingham>=1.3.0 in /usr/local/lib/python3.11/dist-packages (from typer<1.0.0,>=0.3.0->spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (1.5.4)
Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.11/dist-packages (from rich->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.2->autogluon) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.11/dist-packages (from rich->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.2->autogluon) (2.18.0)
Collecting distlib<1,>=0.3.7 (from virtualenv!=20.21.1,>=20.0.24->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon)
  Downloading distlib-0.3.9-py2.py3-none-any.whl.metadata (5.2 kB)
Requirement already satisfied: platformdirs<5,>=3.9.1 in /usr/local/lib/python3.11/dist-packages (from virtualenv!=20.21.1,>=20.0.24->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (4.3.7)
Requirement already satisfied: cloudpathlib<1.0.0,>=0.7.0 in /usr/local/lib/python3.11/dist-packages (from weasel<0.5.0,>=0.1.0->spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (0.21.0)
Requirement already satisfied: wrapt in /usr/local/lib/python3.11/dist-packages (from smart-open->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (1.17.2)
Collecting textual>=0.41.0 (from memray->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon)
  Downloading textual-3.0.1-py3-none-any.whl.metadata (9.0 kB)
Collecting ordered-set (from model-index->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.2->autogluon)
  Downloading ordered_set-4.1.0-py3-none-any.whl.metadata (5.3 kB)
Collecting opencensus-context>=0.1.3 (from opencensus->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon)
  Downloading opencensus_context-0.1.3-py2.py3-none-any.whl.metadata (3.3 kB)
Requirement already satisfied: google-api-core<3.0.0,>=1.0.0 in /usr/local/lib/python3.11/dist-packages (from opencensus->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (2.24.2)
Collecting pycryptodome (from opendatalab->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.2->autogluon)
  Downloading pycryptodome-3.22.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.4 kB)
Collecting openxlab (from opendatalab->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.2->autogluon)
  Downloading openxlab-0.1.2-py3-none-any.whl.metadata (3.8 kB)
Collecting alembic>=1.5.0 (from optuna->mlforecast==0.13.4->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading alembic-1.15.2-py3-none-any.whl.metadata (7.3 kB)
Collecting colorlog (from optuna->mlforecast==0.13.4->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading colorlog-6.9.0-py3-none-any.whl.metadata (10 kB)
Requirement already satisfied: sqlalchemy>=1.4.2 in /usr/local/lib/python3.11/dist-packages (from optuna->mlforecast==0.13.4->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (2.0.40)
Requirement already satisfied: tenacity>=6.2.0 in /usr/local/lib/python3.11/dist-packages (from plotly->catboost<1.3,>=1.2->autogluon.tabular[all]==1.2->autogluon) (9.1.2)
Requirement already satisfied: Mako in /usr/lib/python3/dist-packages (from alembic>=1.5.0->optuna->mlforecast==0.13.4->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (1.1.3)
Requirement already satisfied: googleapis-common-protos<2.0.0,>=1.56.2 in /usr/local/lib/python3.11/dist-packages (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (1.69.2)
Requirement already satisfied: proto-plus<2.0.0,>=1.22.3 in /usr/local/lib/python3.11/dist-packages (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (1.26.1)
Requirement already satisfied: google-auth<3.0.0,>=2.14.1 in /usr/local/lib/python3.11/dist-packages (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (2.38.0)
Requirement already satisfied: marisa-trie>=1.1.0 in /usr/local/lib/python3.11/dist-packages (from language-data>=1.2->langcodes<4.0.0,>=3.2.0->spacy<3.8->autogluon.tabular[all]==1.2->autogluon) (1.2.1)
Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.11/dist-packages (from markdown-it-py>=2.2.0->rich->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.2->autogluon) (0.1.2)
Requirement already satisfied: greenlet>=1 in /usr/local/lib/python3.11/dist-packages (from sqlalchemy>=1.4.2->optuna->mlforecast==0.13.4->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon) (3.1.1)
Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.11/dist-packages (from beautifulsoup4->gdown>=4.0.0->nlpaug<1.2.0,>=1.1.10->autogluon.multimodal==1.2->autogluon) (2.6)
Collecting appdirs~=1.4.3 (from fs->triad>=0.9.7->fugue>=0.9.0->autogluon.timeseries==1.2->autogluon.timeseries[all]==1.2->autogluon)
  Downloading appdirs-1.4.4-py2.py3-none-any.whl.metadata (9.0 kB)
Collecting filelock (from ray<2.40,>=2.10.0->ray[default]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon)
  Downloading filelock-3.14.0-py3-none-any.whl.metadata (2.8 kB)
Collecting oss2~=2.17.0 (from openxlab->opendatalab->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.2->autogluon)
  Downloading oss2-2.17.0.tar.gz (259 kB)
  Preparing metadata (setup.py) ... done
Collecting pytz>=2020.1 (from pandas<2.3.0,>=2.0.0->autogluon.core==1.2->autogluon.core[all]==1.2->autogluon)
  Downloading pytz-2023.4-py2.py3-none-any.whl.metadata (22 kB)
INFO: pip is looking at multiple versions of openxlab to determine which version is compatible with other requirements. This could take a while.
Collecting openxlab (from opendatalab->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.2->autogluon)
  Downloading openxlab-0.1.1-py3-none-any.whl.metadata (3.8 kB)
  Downloading openxlab-0.1.0-py3-none-any.whl.metadata (3.8 kB)
  Downloading openxlab-0.0.38-py3-none-any.whl.metadata (3.8 kB)
  Downloading openxlab-0.0.37-py3-none-any.whl.metadata (3.8 kB)
  Downloading openxlab-0.0.36-py3-none-any.whl.metadata (3.8 kB)
  Downloading openxlab-0.0.35-py3-none-any.whl.metadata (3.8 kB)
  Downloading openxlab-0.0.34-py3-none-any.whl.metadata (3.8 kB)
INFO: pip is still looking at multiple versions of openxlab to determine which version is compatible with other requirements. This could take a while.
  Downloading openxlab-0.0.33-py3-none-any.whl.metadata (3.8 kB)
  Downloading openxlab-0.0.32-py3-none-any.whl.metadata (3.8 kB)
  Downloading openxlab-0.0.31-py3-none-any.whl.metadata (3.8 kB)
  Downloading openxlab-0.0.30-py3-none-any.whl.metadata (3.8 kB)
  Downloading openxlab-0.0.29-py3-none-any.whl.metadata (3.8 kB)
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C.
  Downloading openxlab-0.0.28-py3-none-any.whl.metadata (3.7 kB)
  Downloading openxlab-0.0.27-py3-none-any.whl.metadata (3.7 kB)
  Downloading openxlab-0.0.26-py3-none-any.whl.metadata (3.7 kB)
  Downloading openxlab-0.0.25-py3-none-any.whl.metadata (3.7 kB)
  Downloading openxlab-0.0.24-py3-none-any.whl.metadata (3.7 kB)
  Downloading openxlab-0.0.23-py3-none-any.whl.metadata (3.7 kB)
  Downloading openxlab-0.0.22-py3-none-any.whl.metadata (3.7 kB)
  Downloading openxlab-0.0.21-py3-none-any.whl.metadata (3.7 kB)
  Downloading openxlab-0.0.20-py3-none-any.whl.metadata (3.7 kB)
  Downloading openxlab-0.0.19-py3-none-any.whl.metadata (3.7 kB)
  Downloading openxlab-0.0.18-py3-none-any.whl.metadata (3.7 kB)
  Downloading openxlab-0.0.17-py3-none-any.whl.metadata (3.7 kB)
  Downloading openxlab-0.0.16-py3-none-any.whl.metadata (3.8 kB)
  Downloading openxlab-0.0.15-py3-none-any.whl.metadata (3.8 kB)
  Downloading openxlab-0.0.14-py3-none-any.whl.metadata (3.8 kB)
  Downloading openxlab-0.0.13-py3-none-any.whl.metadata (4.5 kB)
  Downloading openxlab-0.0.12-py3-none-any.whl.metadata (4.5 kB)
  Downloading openxlab-0.0.11-py3-none-any.whl.metadata (4.3 kB)
Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in /usr/local/lib/python3.11/dist-packages (from requests[socks]->gdown>=4.0.0->nlpaug<1.2.0,>=1.1.10->autogluon.multimodal==1.2->autogluon) (1.7.1)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in /usr/local/lib/python3.11/dist-packages (from google-auth<3.0.0,>=2.14.1->google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (5.5.2)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.11/dist-packages (from google-auth<3.0.0,>=2.14.1->google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (0.4.2)
Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.11/dist-packages (from google-auth<3.0.0,>=2.14.1->google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (4.9)
Requirement already satisfied: linkify-it-py<3,>=1 in /usr/local/lib/python3.11/dist-packages (from markdown-it-py[linkify,plugins]>=2.1.0->textual>=0.41.0->memray->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (2.0.3)
Requirement already satisfied: mdit-py-plugins in /usr/local/lib/python3.11/dist-packages (from markdown-it-py[linkify,plugins]>=2.1.0->textual>=0.41.0->memray->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (0.4.2)
Requirement already satisfied: uc-micro-py in /usr/local/lib/python3.11/dist-packages (from linkify-it-py<3,>=1->markdown-it-py[linkify,plugins]>=2.1.0->textual>=0.41.0->memray->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (1.0.3)
Requirement already satisfied: pyasn1<0.7.0,>=0.6.1 in /usr/local/lib/python3.11/dist-packages (from pyasn1-modules>=0.2.1->google-auth<3.0.0,>=2.14.1->google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.40,>=2.10.0; extra == "all"->autogluon.core[all]==1.2->autogluon) (0.6.1)
Downloading autogluon-1.2-py3-none-any.whl (9.6 kB)
Downloading autogluon.core-1.2-py3-none-any.whl (266 kB)
Downloading autogluon.features-1.2-py3-none-any.whl (64 kB)
Downloading autogluon.multimodal-1.2-py3-none-any.whl (429 kB)
Downloading autogluon.tabular-1.2-py3-none-any.whl (352 kB)
Downloading autogluon.timeseries-1.2-py3-none-any.whl (174 kB)
Downloading autogluon.common-1.2-py3-none-any.whl (68 kB)
Downloading coreforecast-0.0.12-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (196 kB)
Downloading mlforecast-0.13.4-py3-none-any.whl (70 kB)
Downloading accelerate-0.34.2-py3-none-any.whl (324 kB)
Downloading boto3-1.37.31-py3-none-any.whl (139 kB)
Downloading catboost-1.2.7-cp311-cp311-manylinux2014_x86_64.whl (98.7 MB)
Downloading evaluate-0.4.3-py3-none-any.whl (84 kB)
Downloading fugue-0.9.1-py3-none-any.whl (278 kB)
Downloading gluonts-0.16.1-py3-none-any.whl (1.5 MB)
Downloading jsonschema-4.21.1-py3-none-any.whl (85 kB)
Downloading lightning-2.5.1-py3-none-any.whl (818 kB)
Downloading nlpaug-1.1.11-py3-none-any.whl (410 kB)
Downloading nltk-3.8.1-py3-none-any.whl (1.5 MB)
Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.3 MB)
Downloading omegaconf-2.2.3-py3-none-any.whl (79 kB)
Downloading openmim-0.3.9-py2.py3-none-any.whl (52 kB)
Downloading pdf2image-1.17.0-py3-none-any.whl (11 kB)
Downloading pytesseract-0.3.10-py3-none-any.whl (14 kB)
Downloading pytorch_metric_learning-2.3.0-py3-none-any.whl (115 kB)
Downloading ray-2.39.0-cp311-cp311-manylinux2014_x86_64.whl (66.4 MB)
Downloading scikit_image-0.24.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (14.9 MB)
Downloading scikit_learn-1.5.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.3 MB)
Downloading spacy-3.7.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.6 MB)
Downloading statsforecast-1.7.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (315 kB)
Downloading timm-1.0.3-py3-none-any.whl (2.3 MB)
Downloading torchmetrics-1.2.1-py3-none-any.whl (806 kB)
Downloading utilsforecast-0.2.4-py3-none-any.whl (40 kB)
Downloading pytorch_lightning-2.5.1-py3-none-any.whl (822 kB)
Downloading adagio-0.2.6-py3-none-any.whl (19 kB)
Downloading botocore-1.37.31-py3-none-any.whl (13.5 MB)
Downloading datasets-3.5.0-py3-none-any.whl (491 kB)
Downloading dill-0.3.8-py3-none-any.whl (116 kB)
Downloading fsspec-2024.12.0-py3-none-any.whl (183 kB)
Downloading jmespath-1.0.1-py3-none-any.whl (20 kB)
Downloading lightning_utilities-0.14.3-py3-none-any.whl (28 kB)
Downloading multiprocess-0.70.16-py311-none-any.whl (143 kB)
Downloading py_spy-0.4.0-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (2.7 MB)
Downloading s3transfer-0.11.4-py3-none-any.whl (84 kB)
Downloading tensorboardX-2.6.2.2-py2.py3-none-any.whl (101 kB)
Downloading thinc-8.2.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (920 kB)
Downloading triad-0.9.8-py3-none-any.whl (62 kB)
Downloading virtualenv-20.30.0-py3-none-any.whl (4.3 MB)
Downloading aiohttp_cors-0.8.1-py3-none-any.whl (25 kB)
Downloading colorful-0.5.6-py2.py3-none-any.whl (201 kB)
Downloading memray-1.17.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.4 MB)
Downloading model_index-0.1.11-py3-none-any.whl (34 kB)
Downloading opencensus-0.11.4-py2.py3-none-any.whl (128 kB)
Downloading opendatalab-0.0.10-py3-none-any.whl (29 kB)
Downloading optuna-4.2.1-py3-none-any.whl (383 kB)
Downloading window_ops-0.0.15-py3-none-any.whl (15 kB)
Downloading xxhash-3.5.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (194 kB)
Downloading alembic-1.15.2-py3-none-any.whl (231 kB)
Downloading blis-0.7.11-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (10.2 MB)
Downloading distlib-0.3.9-py2.py3-none-any.whl (468 kB)
Downloading opencensus_context-0.1.3-py2.py3-none-any.whl (5.1 kB)
Downloading textual-3.0.1-py3-none-any.whl (681 kB)
Downloading colorlog-6.9.0-py3-none-any.whl (11 kB)
Downloading fs-2.4.16-py2.py3-none-any.whl (135 kB)
Downloading openxlab-0.0.11-py3-none-any.whl (55 kB)
Downloading ordered_set-4.1.0-py3-none-any.whl (7.6 kB)
Downloading pycryptodome-3.22.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.3 MB)
Downloading appdirs-1.4.4-py2.py3-none-any.whl (9.6 kB)
Building wheels for collected packages: nvidia-ml-py3, antlr4-python3-runtime, seqeval
  Building wheel for nvidia-ml-py3 (setup.py) ... done
  Created wheel for nvidia-ml-py3: filename=nvidia_ml_py3-7.352.0-py3-none-any.whl size=19208 sha256=cea8cdae30a7d723dbd4d300153d0fa3cf5e31193e73c616e53267a47aed40fd
  Stored in directory: /root/.cache/pip/wheels/47/50/9e/29dc79037d74c3c1bb4a8661fb608e8674b7e4260d6a3f8f51
  Building wheel for antlr4-python3-runtime (setup.py) ... done
  Created wheel for antlr4-python3-runtime: filename=antlr4_python3_runtime-4.9.3-py3-none-any.whl size=144592 sha256=a5c8ac798239fb940e700026abe699f55138fab23a1a92ab8e1a38472bba3125
  Stored in directory: /root/.cache/pip/wheels/1a/97/32/461f837398029ad76911109f07047fde1d7b661a147c7c56d1
  Building wheel for seqeval (setup.py) ... done
  Created wheel for seqeval: filename=seqeval-1.2.2-py3-none-any.whl size=16251 sha256=e0ec56925ffddd1aac621da17b4153e6855c275893dd491020333c0338e93331
  Stored in directory: /root/.cache/pip/wheels/bc/92/f0/243288f899c2eacdfa8c5f9aede4c71a9bad0ee26a01dc5ead
Successfully built nvidia-ml-py3 antlr4-python3-runtime seqeval
Installing collected packages: py-spy, opencensus-context, nvidia-ml-py3, distlib, colorful, appdirs, antlr4-python3-runtime, xxhash, virtualenv, pytesseract, pycryptodome, pdf2image, ordered-set, openxlab, omegaconf, numpy, nltk, lightning-utilities, jmespath, fsspec, fs, dill, colorlog, colorama, torch, tensorboardX, multiprocess, model-index, coreforecast, botocore, blis, alembic, window-ops, utilsforecast, triad, torchvision, torchmetrics, scikit-learn, scikit-image, s3transfer, optuna, opendatalab, jsonschema, gluonts, aiohttp-cors, accelerate, timm, thinc, textual, seqeval, ray, pytorch-metric-learning, pytorch-lightning, openmim, opencensus, nlpaug, mlforecast, datasets, catboost, boto3, adagio, spacy, memray, lightning, fugue, evaluate, autogluon.common, statsforecast, autogluon.features, autogluon.core, autogluon.tabular, autogluon.multimodal, autogluon.timeseries, autogluon
  Attempting uninstall: numpy
    Found existing installation: numpy 2.0.2
    Uninstalling numpy-2.0.2:
      Successfully uninstalled numpy-2.0.2
  Attempting uninstall: nltk
    Found existing installation: nltk 3.9.1
    Uninstalling nltk-3.9.1:
      Successfully uninstalled nltk-3.9.1
  Attempting uninstall: fsspec
    Found existing installation: fsspec 2025.3.2
    Uninstalling fsspec-2025.3.2:
      Successfully uninstalled fsspec-2025.3.2
  Attempting uninstall: torch
    Found existing installation: torch 2.6.0+cu124
    Uninstalling torch-2.6.0+cu124:
      Successfully uninstalled torch-2.6.0+cu124
  Attempting uninstall: blis
    Found existing installation: blis 1.2.1
    Uninstalling blis-1.2.1:
      Successfully uninstalled blis-1.2.1
  Attempting uninstall: torchvision
    Found existing installation: torchvision 0.21.0+cu124
    Uninstalling torchvision-0.21.0+cu124:
      Successfully uninstalled torchvision-0.21.0+cu124
  Attempting uninstall: scikit-learn
    Found existing installation: scikit-learn 1.6.1
    Uninstalling scikit-learn-1.6.1:
      Successfully uninstalled scikit-learn-1.6.1
  Attempting uninstall: scikit-image
    Found existing installation: scikit-image 0.25.2
    Uninstalling scikit-image-0.25.2:
      Successfully uninstalled scikit-image-0.25.2
  Attempting uninstall: jsonschema
    Found existing installation: jsonschema 4.23.0
    Uninstalling jsonschema-4.23.0:
      Successfully uninstalled jsonschema-4.23.0
  Attempting uninstall: accelerate
    Found existing installation: accelerate 1.5.2
    Uninstalling accelerate-1.5.2:
      Successfully uninstalled accelerate-1.5.2
  Attempting uninstall: timm
    Found existing installation: timm 1.0.15
    Uninstalling timm-1.0.15:
      Successfully uninstalled timm-1.0.15
  Attempting uninstall: thinc
    Found existing installation: thinc 8.3.4
    Uninstalling thinc-8.3.4:
      Successfully uninstalled thinc-8.3.4
  Attempting uninstall: spacy
    Found existing installation: spacy 3.8.5
    Uninstalling spacy-3.8.5:
      Successfully uninstalled spacy-3.8.5
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchaudio 2.6.0+cu124 requires torch==2.6.0, but you have torch 2.5.1+cpu which is incompatible.
textblob 0.19.0 requires nltk>=3.9, but you have nltk 3.8.1 which is incompatible.
gcsfs 2025.3.2 requires fsspec==2025.3.2, but you have fsspec 2024.12.0 which is incompatible.
Successfully installed accelerate-0.34.2 adagio-0.2.6 aiohttp-cors-0.8.1 alembic-1.15.2 antlr4-python3-runtime-4.9.3 appdirs-1.4.4 autogluon-1.2 autogluon.common-1.2 autogluon.core-1.2 autogluon.features-1.2 autogluon.multimodal-1.2 autogluon.tabular-1.2 autogluon.timeseries-1.2 blis-0.7.11 boto3-1.37.31 botocore-1.37.31 catboost-1.2.7 colorama-0.4.6 colorful-0.5.6 colorlog-6.9.0 coreforecast-0.0.12 datasets-3.5.0 dill-0.3.8 distlib-0.3.9 evaluate-0.4.3 fs-2.4.16 fsspec-2024.12.0 fugue-0.9.1 gluonts-0.16.1 jmespath-1.0.1 jsonschema-4.21.1 lightning-2.5.1 lightning-utilities-0.14.3 memray-1.17.0 mlforecast-0.13.4 model-index-0.1.11 multiprocess-0.70.16 nlpaug-1.1.11 nltk-3.8.1 numpy-1.26.4 nvidia-ml-py3-7.352.0 omegaconf-2.2.3 opencensus-0.11.4 opencensus-context-0.1.3 opendatalab-0.0.10 openmim-0.3.9 openxlab-0.0.11 optuna-4.2.1 ordered-set-4.1.0 pdf2image-1.17.0 py-spy-0.4.0 pycryptodome-3.22.0 pytesseract-0.3.10 pytorch-lightning-2.5.1 pytorch-metric-learning-2.3.0 ray-2.39.0 s3transfer-0.11.4 scikit-image-0.24.0 scikit-learn-1.5.2 seqeval-1.2.2 spacy-3.7.5 statsforecast-1.7.8 tensorboardX-2.6.2.2 textual-3.0.1 thinc-8.2.5 timm-1.0.3 torch-2.5.1+cpu torchmetrics-1.2.1 torchvision-0.20.1+cpu triad-0.9.8 utilsforecast-0.2.4 virtualenv-20.30.0 window-ops-0.0.15 xxhash-3.5.0

CPU times: user 1.24 s, sys: 365 ms, total: 1.6 s
Wall time: 2min 47s
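
The install log above shows pip replacing several preinstalled packages (for example numpy 2.0.2 with 1.26.4, and torch 2.6.0+cu124 with 2.5.1+cpu). In a notebook that is already running, modules imported before the install may still be loaded at their old versions, so restarting the runtime after an install like this is usually a sensible precaution. The following is a minimal sketch for checking which versions are now installed on disk. The helper name `installed_versions` and the list of packages checked are illustrative choices, not part of AutoGluon.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

# Package names taken from the install log; the selection is an illustrative choice.
report = installed_versions(["numpy", "torch", "catboost", "autogluon"])
for name, ver in report.items():
    print(f"{name:12s} {ver or 'not installed'}")
```

Note that this reads the installed package metadata, not the modules currently loaded in memory, so it reflects what a fresh runtime would import.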

from autogluon.tabular import TabularPredictor
import pandas as pd
from sklearn.model_selection import train_test_split

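# Read data
# A multi-character sep is interpreted as a regular expression (python engine),
# so ".@" matches any single character followed by "@"
df = pd.read_csv('NLP_data/Sentences_AllAgree.txt', sep=".@", header=None, engine='python', encoding="ISO-8859-1")  # Finbert data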
# df = pd.read_csv('NLP_data/Sentences_AllAgree.txt', sep=".@", header=None, engine='python', encoding = "utf-8")  # Finbert data
# tmp = pd.read_csv('NLP_data/Sentences_75Agree.txt', sep=".@", header=None, engine='python')
# df = pd.concat([df,tmp])
# tmp = pd.read_csv('NLP_data/Sentences_66Agree.txt', sep=".@", header=None, engine='python')
# df = pd.concat([df,tmp])
# tmp = pd.read_csv('NLP_data/Sentences_50Agree.txt', sep=".@", header=None, engine='python')
# df = pd.concat([df,tmp])
df.columns = ["Text","Label"]
print(df.shape)
df.head()

(2264, 2)

	Text	Label
0	According to Gran , the company has no plans t...	neutral
1	For the last quarter of 2010 , Componenta 's n...	positive
2	In the third quarter of 2010 , net sales incre...	positive
3	Operating profit rose to EUR 13.1 mn from EUR ...	positive
4	Operating profit totalled EUR 21.1 mn , up fro...	positive
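
Because the separator passed to `read_csv` has more than one character, pandas treats it as a regular expression, in which `.` matches any character. The sketch below shows what that pattern does to a single line, using only the standard library. It assumes each line of the file has the form `sentence@label`; the sample line is hypothetical and is not taken from the data file.

```python
import re

# Hypothetical line in the assumed `sentence@label` layout (not from the data file).
line = "Operating profit rose compared with last year .@positive"

# Regex ".@": the "." matches ANY character, so the character immediately
# before "@" is consumed together with the "@" itself.
text, label = re.split(r".@", line)
print(repr(text), label)

# A stricter alternative: split once, from the right, on a literal "@".
# This keeps the sentence's final character intact.
text2, label2 = line.rsplit("@", 1)
print(repr(text2), label2)
```

Both approaches recover the same label. They differ only in whether the last character of the sentence is kept, which matters little for classification but is worth knowing when inspecting the text.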

import seaborn as sns
import matplotlib.pyplot as plt
sns.countplot(x='Label', data=df)
plt.show()

[Figure: count plot of the number of sentences per Label class]

27.2. Fit the model#

The next few lines of code are all that are needed to train the model. It is remarkable in its parsimony!

As part of preprocessing, AutoGluon vectorizes the text into n-gram counts, and it adjusts the size of the vocabulary so that the features use the available memory efficiently.
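
To make the idea of a size-limited vocabulary concrete, here is a minimal standard-library sketch of n-gram count vectorization with a cap on the number of features. It is not AutoGluon's implementation (the training log reports that AutoGluon fits a scikit-learn `CountVectorizer`). The function names, the toy sentences, and the cap of 4 features are all assumptions made for illustration.

```python
from collections import Counter

def ngrams(tokens, n_max=2):
    """Yield all word n-grams of length 1..n_max as space-joined strings."""
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

def fit_vocabulary(texts, max_features, n_max=2):
    """Keep only the max_features most frequent n-grams across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(ngrams(text.lower().split(), n_max))
    # Rank by frequency (descending), then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term: idx for idx, (term, _) in enumerate(ranked[:max_features])}

def transform(text, vocabulary, n_max=2):
    """Turn one text into a row of counts, one column per vocabulary term."""
    row = [0] * len(vocabulary)
    for term in ngrams(text.lower().split(), n_max):
        if term in vocabulary:
            row[vocabulary[term]] += 1
    return row

# Hypothetical toy sentences, not drawn from the dataset.
docs = ["operating profit rose", "operating profit fell", "net sales rose"]
vocab = fit_vocabulary(docs, max_features=4)
matrix = [transform(d, vocab) for d in docs]
print(vocab)
print(matrix)
```

Lowering `max_features` shrinks every row of the resulting matrix, which is the lever a memory-aware vectorizer can use to keep the feature matrix within a budget.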

!pip install dask[dataframe] --quiet

%%time
# Train the model

# Hold out 30% of the rows as a test set
train_data, test_data = train_test_split(df, test_size=0.3, random_state=42)
print("Train size =", train_data.shape, " | Test size =", test_data.shape)

predictor = TabularPredictor(label='Label').fit(train_data=train_data)  # optionally add: hyperparameters='multimodal'

# predictor = task.fit(train_data=train_data, label='Label')
# Note: this evaluates on the training data, so it reports in-sample performance
performance = predictor.evaluate(train_data)

No path specified. Models will be saved in: "AutogluonModels/ag-20250410_125949"
Verbosity: 2 (Standard Logging)
=================== System Info ===================
AutoGluon Version:  1.2
Python Version:     3.11.11
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
CPU Count:          2
Memory Avail:       11.56 GB / 12.67 GB (91.2%)
Disk Space Avail:   65.27 GB / 107.72 GB (60.6%)
===================================================
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='experimental' : New in v1.2: Pre-trained foundation model + parallel fits. The absolute best accuracy without consideration for inference speed. Does not support GPU.
	presets='best'         : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
	presets='high'         : Strong accuracy with fast inference speed.
	presets='good'         : Good accuracy with very fast inference speed.
	presets='medium'       : Fast training time, ideal for initial prototyping.

Train size = (1584, 2)  | Test size = (680, 2)

Beginning AutoGluon training ...
AutoGluon will save models to "/content/drive/My Drive/Books_Writings/NLPBook/AutogluonModels/ag-20250410_125949"
Train Data Rows:    1584
Train Data Columns: 1
Label Column:       Label
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == object).
	3 unique label values:  ['neutral', 'positive', 'negative']
	If 'multiclass' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
Problem Type:       multiclass
Preprocessing data ...
Train Data Class Count: 3
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    11821.04 MB
	Train Data (Original)  Memory Usage: 0.27 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
	Stage 3 Generators:
		Fitting CategoryFeatureGenerator...
			Fitting CategoryMemoryMinimizeFeatureGenerator...
		Fitting TextSpecialFeatureGenerator...
			Fitting BinnedFeatureGenerator...
			Fitting DropDuplicatesFeatureGenerator...
		Fitting TextNgramFeatureGenerator...
			Fitting CountVectorizer for text features: ['Text']
			CountVectorizer fit with vocabulary size = 186
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Stage 5 Generators:
		Fitting DropDuplicatesFeatureGenerator...
	Types of features in original data (raw dtype, special dtypes):
		('object', ['text']) : 1 | ['Text']
	Types of features in processed data (raw dtype, special dtypes):
		('category', ['text_as_category'])  :   1 | ['Text']
		('int', ['binned', 'text_special']) :  20 | ['Text.char_count', 'Text.word_count', 'Text.capital_ratio', 'Text.lower_ratio', 'Text.digit_ratio', ...]
		('int', ['text_ngram'])             : 180 | ['__nlp__.000', '__nlp__.10', '__nlp__.11', '__nlp__.12', '__nlp__.20', ...]
	8.5s = Fit runtime
	1 features in original data used to generate 201 features in processed data.
	Train Data (Processed) Memory Usage: 0.58 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 8.63s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
	To change this, specify the eval_metric parameter of Predictor()
Automatically generating train/validation split with holdout_frac=0.2, Train Rows: 1267, Val Rows: 317
User-specified model hyperparameters to be fit:
{
	'NN_TORCH': [{}],
	'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, {'learning_rate': 0.03, 'num_leaves': 128, 'feature_fraction': 0.9, 'min_data_in_leaf': 3, 'ag_args': {'name_suffix': 'Large', 'priority': 0, 'hyperparameter_tune_kwargs': None}}],
	'CAT': [{}],
	'XGB': [{}],
	'FASTAI': [{}],
	'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models, fit_strategy="sequential" ...
Fitting model: KNeighborsUnif ...
Exception ignored on calling ctypes callback function: <function ThreadpoolController._find_libraries_with_dl_iterate_phdr.<locals>.match_library_callback at 0x7c4b4f810720>
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/threadpoolctl.py", line 1005, in match_library_callback
    self._make_controller_from_path(filepath)
  File "/usr/local/lib/python3.11/dist-packages/threadpoolctl.py", line 1187, in _make_controller_from_path
    lib_controller = controller_class(
                     ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/threadpoolctl.py", line 114, in __init__
    self.dynlib = ctypes.CDLL(filepath, mode=_RTLD_NOLOAD)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: /usr/local/lib/python3.11/dist-packages/numpy.libs/libscipy_openblas64_-99b71e71.so: cannot open shared object file: No such file or directory
	0.7634	 = Validation score   (accuracy)
	5.34s	 = Training   runtime
	0.1s	 = Validation runtime
Fitting model: KNeighborsDist ...
	0.776	 = Validation score   (accuracy)
	0.04s	 = Training   runtime
	0.02s	 = Validation runtime
Fitting model: NeuralNetFastAI ...
	0.7382	 = Validation score   (accuracy)
	7.24s	 = Training   runtime
	0.04s	 = Validation runtime
Fitting model: LightGBMXT ...
	0.8801	 = Validation score   (accuracy)
	19.72s	 = Training   runtime
	0.03s	 = Validation runtime
Fitting model: LightGBM ...
	0.8549	 = Validation score   (accuracy)
	2.97s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: RandomForestGini ...
	0.8612	 = Validation score   (accuracy)
	2.47s	 = Training   runtime
	0.09s	 = Validation runtime
Fitting model: RandomForestEntr ...
	0.8549	 = Validation score   (accuracy)
	3.87s	 = Training   runtime
	0.11s	 = Validation runtime
Fitting model: CatBoost ...
	Warning: Exception caused CatBoost to fail during training (ImportError)... Skipping this model.
		Import catboost failed. Numpy version may be outdated, Please ensure numpy version >=1.17.0. If it is not, please try 'pip uninstall numpy -y; pip install numpy>=1.17.0' Detailed info: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
Fitting model: ExtraTreesGini ...
	0.8549	 = Validation score   (accuracy)
	2.43s	 = Training   runtime
	0.09s	 = Validation runtime
Fitting model: ExtraTreesEntr ...
	0.8612	 = Validation score   (accuracy)
	2.32s	 = Training   runtime
	0.12s	 = Validation runtime
Fitting model: XGBoost ...
	0.858	 = Validation score   (accuracy)
	4.09s	 = Training   runtime
	0.02s	 = Validation runtime
Fitting model: NeuralNetTorch ...
	0.7224	 = Validation score   (accuracy)
	8.62s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: LightGBMLarge ...
	0.8644	 = Validation score   (accuracy)
	2.98s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
	Ensemble Weights: {'LightGBMXT': 1.0}
	0.8801	 = Validation score   (accuracy)
	0.11s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 74.43s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 10982.5 rows/s (317 batch size)
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/content/drive/My Drive/Books_Writings/NLPBook/AutogluonModels/ag-20250410_125949")

CPU times: user 42.6 s, sys: 2.69 s, total: 45.3 s
Wall time: 1min 15s

# TEST OUT-OF-SAMPLE

y_test = test_data['Label']
test_data_nolabel = test_data.drop(labels=['Label'],axis=1)
y_pred = predictor.predict(test_data_nolabel)
y_prob = predictor.predict_proba(test_data_nolabel)  # class probabilities, not hard labels
perf = predictor.evaluate_predictions(y_true=y_test, y_pred=y_pred, auxiliary_metrics=True)
print(perf)

{'accuracy': 0.8720588235294118, 'balanced_accuracy': np.float64(0.7908290922121237), 'mcc': np.float64(0.7572923133729487)}

predictor.leaderboard(test_data, silent=True)

/usr/local/lib/python3.11/dist-packages/fastai/learner.py:455: UserWarning: load_learner` uses Python's insecure pickle module, which can execute malicious arbitrary code when loading. Only load files you trust.
If you only need to load model weights and optimizer state, use the safe `Learner.load` instead.
  warn("load_learner` uses Python's insecure pickle module, which can execute malicious arbitrary code when loading. Only load files you trust.\nIf you only need to load model weights and optimizer state, use the safe `Learner.load` instead.")

	model	score_test	score_val	eval_metric	pred_time_test	pred_time_val	fit_time	pred_time_test_marginal	pred_time_val_marginal	fit_time_marginal	stack_level	can_infer	fit_order
0	LightGBMXT	0.872059	0.880126	accuracy	0.040911	0.027874	19.715188	0.040911	0.027874	19.715188	1	True	4
1	WeightedEnsemble_L2	0.872059	0.880126	accuracy	0.047272	0.028864	19.820758	0.006361	0.000990	0.105569	2	True	13
2	LightGBM	0.867647	0.854890	accuracy	0.026660	0.010633	2.969520	0.026660	0.010633	2.969520	1	True	5
3	XGBoost	0.854412	0.858044	accuracy	0.092813	0.021459	4.090568	0.092813	0.021459	4.090568	1	True	10
4	ExtraTreesEntr	0.839706	0.861199	accuracy	0.145543	0.120176	2.315738	0.145543	0.120176	2.315738	1	True	9
5	RandomForestGini	0.838235	0.861199	accuracy	0.165986	0.094227	2.474920	0.165986	0.094227	2.474920	1	True	6
6	ExtraTreesGini	0.833824	0.854890	accuracy	0.173899	0.087760	2.433944	0.173899	0.087760	2.433944	1	True	8
7	LightGBMLarge	0.830882	0.864353	accuracy	0.032559	0.007579	2.982852	0.032559	0.007579	2.982852	1	True	12
8	RandomForestEntr	0.827941	0.854890	accuracy	0.180395	0.109229	3.870272	0.180395	0.109229	3.870272	1	True	7
9	KNeighborsDist	0.704412	0.776025	accuracy	0.028021	0.016762	0.039695	0.028021	0.016762	0.039695	1	True	2
10	KNeighborsUnif	0.702941	0.763407	accuracy	0.027316	0.102499	5.339034	0.027316	0.102499	5.339034	1	True	1
11	NeuralNetTorch	0.682353	0.722397	accuracy	0.027745	0.013661	8.619121	0.027745	0.013661	8.619121	1	True	11
12	NeuralNetFastAI	0.664706	0.738170	accuracy	0.035750	0.036690	7.242823	0.035750	0.036690	7.242823	1	True	3

27.3. Metrics

https://en.wikipedia.org/wiki/Receiver_operating_characteristic

https://srdas.github.io/MLBook2/3_MachineLearningOverview.html

https://srdas.github.io/MLBook2/3_MachineLearningOverview.html#ROC-and-AUC
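
The metrics that evaluate_predictions reports in this chapter (accuracy, balanced accuracy, MCC, F1, precision, recall) can all be derived from the confusion matrix in the binary case. The following is a minimal sketch in plain Python, not AutoGluon's own implementation: the helper name binary_metrics and the toy labels are made up for illustration, and the code assumes every denominator is non-zero.

```python
from math import sqrt

def binary_metrics(y_true, y_pred, positive=1):
    """Hypothetical helper: binary metrics computed from confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    recall = tp / (tp + fn)        # true positive rate (TPR)
    specificity = tn / (tn + fp)   # true negative rate; FPR = 1 - specificity
    precision = tp / (tp + fp)
    return {
        'accuracy': (tp + tn) / len(pairs),
        'balanced_accuracy': (recall + specificity) / 2,
        'mcc': (tp * tn - fp * fn)
               / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        'f1': 2 * precision * recall / (precision + recall),
        'precision': precision,
        'recall': recall,
    }

# Made-up labels for illustration only (not taken from the datasets in this chapter)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Balanced accuracy averages the recall of each class, so it is less flattering than plain accuracy when the classes are imbalanced. For a multiclass problem it generalizes to the mean of the per-class recalls.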

27.4. Movie Reviews, one more time, with AG-Tabular

train_data = pd.read_csv("NLP_data/movie_review_train.txt", sep = " ", header=None)
test_data = pd.read_csv("NLP_data/movie_review_test.txt", sep = " ", header=None)
train_data.columns = ['Label','Text']
test_data.columns = ['Label','Text']
print(train_data.shape, test_data.shape)
train_data.head()

(4001, 2) (1000, 2)

	Label	Text
0	__label__0	Homelessness (or Houselessness as George Carlin stated) has been an issue for years but never a plan to help those on the street that were once considered human who did everything from going to school, work, or vote for the matter. Most people think of the homeless as just a lost cause while worrying about things such as racism, the war on Iraq, pressuring kids to succeed, technology, the elections, inflation, or worrying if they'll be next to end up on the streets.<br /><br />But what if you were given a bet to live on the streets for a month without the luxuries you once had from a home,...
1	__label__1	This film lacked something I couldn't put my finger on at first: charisma on the part of the leading actress. This inevitably translated to lack of chemistry when she shared the screen with her leading man. Even the romantic scenes came across as being merely the actors at play. It could very well have been the director who miscalculated what he needed from the actors. I just don't know.<br /><br />But could it have been the screenplay? Just exactly who was the chef in love with? He seemed more enamored of his culinary skills and restaurant, and ultimately of himself and his youthful explo...
2	__label__1	\"It appears that many critics find the idea of a Woody Allen drama unpalatable.\" And for good reason: they are unbearably wooden and pretentious imitations of Bergman. And let's not kid ourselves: critics were mostly supportive of Allen's Bergman pretensions, Allen's whining accusations to the contrary notwithstanding. What I don't get is this: why was Allen generally applauded for his originality in imitating Bergman, but the contemporaneous Brian DePalma was excoriated for \"ripping off\" Hitchcock in his suspense/horror films? In Robin Wood's view, it's a strange form of cultural snob...
3	__label__0	This isn't the comedic Robin Williams, nor is it the quirky/insane Robin Williams of recent thriller fame. This is a hybrid of the classic drama without over-dramatization, mixed with Robin's new love of the thriller. But this isn't a thriller, per se. This is more a mystery/suspense vehicle through which Williams attempts to locate a sick boy and his keeper.<br /><br />Also starring Sandra Oh and Rory Culkin, this Suspense Drama plays pretty much like a news report, until William's character gets close to achieving his goal.<br /><br />I must say that I was highly entertained, though this...
4	__label__1	I don't know who to blame, the timid writers or the clueless director. It seemed to be one of those movies where so much was paid to the stars (Angie, Charlie, Denise, Rosanna and Jon) that there wasn't enough left to really make a movie. This could have been very entertaining, but there was a veil of timidity, even cowardice, that hung over each scene. Since it got an R rating anyway why was the ubiquitous bubble bath scene shot with a 70-year-old woman and not Angie Harmon? Why does Sheen sleepwalk through potentially hot relationships WITH TWO OF THE MOST BEAUTIFUL AND SEXY ACTRESSES in...

%%time
#TRAIN THE MODEL

print("Train size =",train_data.shape," | Test size =",test_data.shape)

predictor = TabularPredictor(label='Label').fit(train_data=train_data)  # optionally add hyperparameters='multimodal'
performance = predictor.evaluate(train_data)  # in-sample score only; the out-of-sample test comes next

No path specified. Models will be saved in: "AutogluonModels/ag-20250409_014135"
Verbosity: 2 (Standard Logging)
=================== System Info ===================
AutoGluon Version:  1.2
Python Version:     3.11.11
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
CPU Count:          2
Memory Avail:       10.97 GB / 12.67 GB (86.6%)
Disk Space Avail:   63.70 GB / 112.64 GB (56.5%)
===================================================
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets. Defaulting to `'medium'`...
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='experimental' : New in v1.2: Pre-trained foundation model + parallel fits. The absolute best accuracy without consideration for inference speed. Does not support GPU.
	presets='best'         : Maximize accuracy. Recommended for most users. Use in competitions and benchmarks.
	presets='high'         : Strong accuracy with fast inference speed.
	presets='good'         : Good accuracy with very fast inference speed.
	presets='medium'       : Fast training time, ideal for initial prototyping.
Beginning AutoGluon training ...
AutoGluon will save models to "/content/drive/My Drive/Books_Writings/NLPBook/AutogluonModels/ag-20250409_014135"
Train Data Rows:    4001
Train Data Columns: 1
Label Column:       Label
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
	2 unique label values:  ['__label__0', '__label__1']
	If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during Predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression', 'quantile'])
Problem Type:       binary
Preprocessing data ...
Selected class <--> label mapping:  class 1 = __label__1, class 0 = __label__0
	Note: For your binary classification, AutoGluon arbitrarily selected which label-value represents positive (__label__1) vs negative (__label__0) class.
	To explicitly set the positive_class, either rename classes to 1 and 0, or specify positive_class in Predictor init.
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    11243.66 MB
	Train Data (Original)  Memory Usage: 5.25 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.

Train size = (4001, 2)  | Test size = (1000, 2)

	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
	Stage 3 Generators:
		Fitting CategoryFeatureGenerator...
			Fitting CategoryMemoryMinimizeFeatureGenerator...
		Fitting TextSpecialFeatureGenerator...
			Fitting BinnedFeatureGenerator...
			Fitting DropDuplicatesFeatureGenerator...
		Fitting TextNgramFeatureGenerator...
			Fitting CountVectorizer for text features: ['Text']
			CountVectorizer fit with vocabulary size = 5515
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Stage 5 Generators:
		Fitting DropDuplicatesFeatureGenerator...
	Types of features in original data (raw dtype, special dtypes):
		('object', ['text']) : 1 | ['Text']
	Types of features in processed data (raw dtype, special dtypes):
		('category', ['text_as_category'])  :    1 | ['Text']
		('int', ['binned', 'text_special']) :   30 | ['Text.char_count', 'Text.word_count', 'Text.capital_ratio', 'Text.lower_ratio', 'Text.digit_ratio', ...]
		('int', ['text_ngram'])             : 5437 | ['__nlp__.000', '__nlp__.10', '__nlp__.10 10', '__nlp__.100', '__nlp__.11', ...]
	56.4s = Fit runtime
	1 features in original data used to generate 5468 features in processed data.
	Train Data (Processed) Memory Usage: 41.61 MB (0.4% of available memory)
Data preprocessing and feature engineering runtime = 57.17s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
	To change this, specify the eval_metric parameter of Predictor()
Automatically generating train/validation split with holdout_frac=0.12496875781054737, Train Rows: 3501, Val Rows: 500
User-specified model hyperparameters to be fit:
{
	'NN_TORCH': [{}],
	'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, {'learning_rate': 0.03, 'num_leaves': 128, 'feature_fraction': 0.9, 'min_data_in_leaf': 3, 'ag_args': {'name_suffix': 'Large', 'priority': 0, 'hyperparameter_tune_kwargs': None}}],
	'CAT': [{}],
	'XGB': [{}],
	'FASTAI': [{}],
	'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models, fit_strategy="sequential" ...
Fitting model: KNeighborsUnif ...
	0.59	 = Validation score   (accuracy)
	1.89s	 = Training   runtime
	0.54s	 = Validation runtime
Fitting model: KNeighborsDist ...
	0.59	 = Validation score   (accuracy)
	1.95s	 = Training   runtime
	0.81s	 = Validation runtime
Fitting model: LightGBMXT ...
	0.88	 = Validation score   (accuracy)
	11.53s	 = Training   runtime
	0.05s	 = Validation runtime
Fitting model: LightGBM ...
	0.868	 = Validation score   (accuracy)
	13.54s	 = Training   runtime
	0.08s	 = Validation runtime
Fitting model: RandomForestGini ...
	0.856	 = Validation score   (accuracy)
	12.91s	 = Training   runtime
	0.09s	 = Validation runtime
Fitting model: RandomForestEntr ...
	0.846	 = Validation score   (accuracy)
	13.4s	 = Training   runtime
	0.13s	 = Validation runtime
Fitting model: CatBoost ...
	Warning: Exception caused CatBoost to fail during training (ImportError)... Skipping this model.
		Import catboost failed. Numpy version may be outdated, Please ensure numpy version >=1.17.0. If it is not, please try 'pip uninstall numpy -y; pip install numpy>=1.17.0' Detailed info: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
Fitting model: ExtraTreesGini ...
	0.842	 = Validation score   (accuracy)
	15.88s	 = Training   runtime
	0.1s	 = Validation runtime
Fitting model: ExtraTreesEntr ...
	0.858	 = Validation score   (accuracy)
	15.53s	 = Training   runtime
	0.09s	 = Validation runtime
Fitting model: NeuralNetFastAI ...
No improvement since epoch 7: early stopping
	0.62	 = Validation score   (accuracy)
	4.29s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: XGBoost ...
	0.85	 = Validation score   (accuracy)
	106.51s	 = Training   runtime
	0.05s	 = Validation runtime
Fitting model: NeuralNetTorch ...
	0.628	 = Validation score   (accuracy)
	7.81s	 = Training   runtime
	0.02s	 = Validation runtime
Fitting model: LightGBMLarge ...
	0.842	 = Validation score   (accuracy)
	42.73s	 = Training   runtime
	0.06s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
	Ensemble Weights: {'LightGBMXT': 1.0}
	0.88	 = Validation score   (accuracy)
	0.07s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 312.4s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 10347.1 rows/s (500 batch size)
Disabling decision threshold calibration for metric `accuracy` due to having fewer than 10000 rows of validation data for calibration, to avoid overfitting (500 rows).
	`accuracy` is generally not improved through threshold calibration. Force calibration via specifying `calibrate_decision_threshold=True`.
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/content/drive/My Drive/Books_Writings/NLPBook/AutogluonModels/ag-20250409_014135")

CPU times: user 5min 27s, sys: 5.25 s, total: 5min 32s
Wall time: 5min 16s
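
The log above shows AutoGluon fitting a CountVectorizer on the Text column and producing thousands of text_ngram features, which the tree-based models then consume. The sketch below illustrates the general idea of n-gram count features with scikit-learn. The three toy documents and the ngram_range=(1, 2) setting are assumptions for illustration, and AutoGluon's actual vectorizer settings are not shown in the log and may differ.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Toy documents, made up for illustration
docs = [
    "great film",
    "dull film",
    "great acting great film",
]

# Count unigrams and bigrams (assumed setting; not read from AutoGluon)
vectorizer = CountVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(docs)   # sparse document-by-ngram count matrix
vocab = vectorizer.vocabulary_       # maps each n-gram to its column index

print("Matrix shape (documents, n-grams):", X.shape)
print("N-grams:", sorted(vocab))
print("Count of 'great' in the third document:", X[2, vocab['great']])
```

Each column of X is one n-gram and each entry is how often that n-gram occurs in the document, which is why a single text column can expand into several thousand numeric features.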

%%time
# TEST OUT-OF-SAMPLE

y_test = test_data['Label']
test_data_nolabel = test_data.drop(labels=['Label'],axis=1)
y_pred = predictor.predict(test_data_nolabel)
y_prob = predictor.predict_proba(test_data_nolabel)
perf = predictor.evaluate_predictions(y_true=y_test, y_pred=y_pred, auxiliary_metrics=True)
print(perf)

{'accuracy': 0.842, 'balanced_accuracy': np.float64(0.8420313681254725), 'mcc': np.float64(0.6843694036793613), 'f1': np.float64(0.8397565922920892), 'precision': np.float64(0.8536082474226804), 'recall': np.float64(0.8263473053892215)}
CPU times: user 2.26 s, sys: 31.7 ms, total: 2.29 s
Wall time: 2.59 s

#ROC, AUC
import numpy as np
from sklearn.metrics import roc_curve, auc
# Hard 0/1 predictions: 1 when P(__label__1) > P(__label__0). Columns are selected by
# name, which avoids the deprecated positional lookup on a Series.
# Note: with hard labels the ROC has a single operating point, so the AUC equals
# balanced accuracy. Pass y_prob['__label__1'] as y_score to trace the full curve.
y_score = (y_prob['__label__1'] > y_prob['__label__0']).astype(int)
y_true = np.array([1 if j=="__label__1" else 0 for j in y_test])
fpr, tpr, _ = roc_curve(y_true, y_score)

plt.title('ROC curve')
plt.xlabel('FPR (1 - Specificity)')
plt.ylabel('TPR (Recall)')

plt.plot(fpr,tpr)
plt.plot([0,1],[0,1], ls='dashed',color='black')  # chance diagonal
plt.show()
print('Area under curve (AUC): ', auc(fpr,tpr))

_images/8c4bbd9ae98a32875fb88c176c39579a2ce4fe9886f599aceed4d36121695217.png

Area under curve (AUC):  0.8420313681254725
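
The AUC printed above is identical to the balanced accuracy reported earlier (0.8420313681254725) because the ROC was scored with hard 0/1 predictions, which leaves only one operating point on the curve. A ROC curve is normally traced from continuous scores. The sketch below contrasts the two on made-up labels and probabilities (not the movie-review data), using scikit-learn's roc_curve, auc, and balanced_accuracy_score.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc, balanced_accuracy_score

# Made-up labels and positive-class probabilities for illustration
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
p_pos = np.array([0.10, 0.30, 0.60, 0.45, 0.40, 0.70, 0.80, 0.90])

# ROC from continuous scores: one point per distinct threshold
fpr_soft, tpr_soft, _ = roc_curve(y_true, p_pos)
auc_soft = auc(fpr_soft, tpr_soft)

# ROC from thresholded 0/1 predictions: a single interior point
y_hard = (p_pos > 0.5).astype(int)
fpr_hard, tpr_hard, _ = roc_curve(y_true, y_hard)
auc_hard = auc(fpr_hard, tpr_hard)

bal_acc = balanced_accuracy_score(y_true, y_hard)

print("AUC from probabilities:   ", auc_soft)
print("AUC from hard predictions:", auc_hard)
print("Balanced accuracy:        ", bal_acc)
```

With the predictor in this section, the analogous continuous score would presumably be the __label__1 column of y_prob, since predict_proba returns one column per class label.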

predictor.leaderboard(test_data, silent=True)

/usr/local/lib/python3.11/dist-packages/fastai/learner.py:455: UserWarning: load_learner` uses Python's insecure pickle module, which can execute malicious arbitrary code when loading. Only load files you trust.
If you only need to load model weights and optimizer state, use the safe `Learner.load` instead.
  warn("load_learner` uses Python's insecure pickle module, which can execute malicious arbitrary code when loading. Only load files you trust.\nIf you only need to load model weights and optimizer state, use the safe `Learner.load` instead.")

	model	score_test	score_val	eval_metric	pred_time_test	pred_time_val	fit_time	pred_time_test_marginal	pred_time_val_marginal	fit_time_marginal	stack_level	can_infer	fit_order
0	ExtraTreesGini	0.848	0.842	accuracy	0.203905	0.102248	15.882477	0.203905	0.102248	15.882477	1	True	7
1	LightGBMXT	0.842	0.880	accuracy	0.133751	0.047568	11.533216	0.133751	0.047568	11.533216	1	True	3
2	WeightedEnsemble_L2	0.842	0.880	accuracy	0.139639	0.048323	11.607130	0.005888	0.000755	0.073915	2	True	13
3	ExtraTreesEntr	0.841	0.858	accuracy	0.198025	0.092897	15.526747	0.198025	0.092897	15.526747	1	True	8
4	LightGBM	0.833	0.868	accuracy	0.174953	0.079984	13.537488	0.174953	0.079984	13.537488	1	True	4
5	RandomForestGini	0.828	0.856	accuracy	0.205734	0.092756	12.905852	0.205734	0.092756	12.905852	1	True	5
6	XGBoost	0.819	0.850	accuracy	0.142225	0.047694	106.505279	0.142225	0.047694	106.505279	1	True	10
7	RandomForestEntr	0.815	0.846	accuracy	0.193312	0.134340	13.403086	0.193312	0.134340	13.403086	1	True	6
8	LightGBMLarge	0.805	0.842	accuracy	0.166489	0.057016	42.730621	0.166489	0.057016	42.730621	1	True	12
9	NeuralNetFastAI	0.567	0.620	accuracy	0.030178	0.010674	4.290813	0.030178	0.010674	4.290813	1	True	9
10	KNeighborsDist	0.567	0.590	accuracy	1.180527	0.809882	1.947131	1.180527	0.809882	1.947131	1	True	2
11	KNeighborsUnif	0.567	0.590	accuracy	1.362142	0.540332	1.889514	1.362142	0.540332	1.889514	1	True	1
12	NeuralNetTorch	0.541	0.628	accuracy	0.025236	0.015067	7.805507	0.025236	0.015067	7.805507	1	True	11

The hyperparameters='multimodal' option below also trains a multimodal transformer model alongside the tabular models, which takes considerable time, so the code is left commented out. It is probably best to skip this segment unless you are prepared to wait for it to finish.

%%time

# predictor = TabularPredictor(label='Label').fit(train_data=train_data, hyperparameters='multimodal')
# y_test = test_data['Label']
# test_data_nolabel = test_data.drop(labels=['Label'],axis=1)
# y_pred = predictor.predict(test_data_nolabel)
# perf = predictor.evaluate_predictions(y_true=y_test, y_pred=y_pred, auxiliary_metrics=True)

CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 6.44 µs

27.5. Multimodal Extension

AutoGluon can also handle images in addition to text and tabular features. The example below is taken from the AutoGluon multimodal tutorial: https://auto.gluon.ai/stable/tutorials/multimodal/multimodal_prediction/beginner_multimodal.html

import os
import numpy as np
import warnings
warnings.filterwarnings('ignore')
np.random.seed(123)

%%time
download_dir = './ag_automm_tutorial'
zip_file = 'https://automl-mm-bench.s3.amazonaws.com/petfinder_for_tutorial.zip'
from autogluon.core.utils.loaders import load_zip
load_zip.unzip(zip_file, unzip_dir=download_dir)

Unzipping ./ag_automm_tutorial/file.zip to ./ag_automm_tutorial

CPU times: user 1.86 s, sys: 448 ms, total: 2.31 s
Wall time: 6min 1s

import pandas as pd
download_dir = './ag_automm_tutorial'
dataset_path = download_dir + '/petfinder_for_tutorial'
train_data = pd.read_csv(f'{dataset_path}/train.csv', index_col=0)
test_data = pd.read_csv(f'{dataset_path}/test.csv', index_col=0)
label_col = 'AdoptionSpeed'

train_data.head()

	Type	Name	Age	Breed1	Breed2	Gender	Color1	Color2	Color3	MaturitySize	...	Quantity	Fee	State	RescuerID	Description	PetID	PhotoAmt	Images
0	2	Yumi Hamasaki	4	292	265	2	1	5	7	2	...	1	0	41326	bcc4e1b9557a8b3aaf545ea8e6e86991	I rescued Yumi Hamasaki at a food stall far away in Kelantan. At that time i was on my way back to KL, she was suffer from stomach problem and looking very2 sick.. I send her to vet & get the treatment + vaccinated and right now she's very2 healthy.. About yumi : - love to sleep with ppl - she will keep on meowing if she's hugry - very2 active, always seeking for people to accompany her playing - well trained (poo+pee in her own potty) - easy to bathing - I only feed her with these brands : IAMS, Kittenbites, Pro-formance Reason why i need someone to adopt Yumi: I just married and need to ...	7d7a39d71	3.0	images/7d7a39d71-1.jpg
1	2	Nene/ Kimie	12	285	0	2	5	6	7	2	...	1	0	41326	f0450bf0efe0fa3ff9321d0b827b1237	Has adopted by a friend with new pet name Kimie	0e107c82f	3.0	images/0e107c82f-1.jpg
2	2	Mattie	12	266	0	2	1	7	0	2	...	1	0	41401	9b52af6d48a4521fd01d4028eb5879a3	I rescued Mattie with a broken leg. After surgery with pin inserted in her leg, she's made a full recovery.	1a8fd6707	5.0	images/1a8fd6707-1.jpg
3	1	NaN	1	189	307	2	1	2	0	2	...	1	0	41401	88da1210e021a5cf43480b074778f3bc	She born on 30 September . I really hope the animal lovers can adopt her.	bca8b44ae	3.0	images/bca8b44ae-1.jpg
4	2	Coco	6	276	285	2	2	4	7	2	...	1	100	41326	227d7b1bcfaffb5f9882bf57b5ee8fab	Calico Tame and easy going Diet RC Kitten Supplement - brewer yeast + VCO *11.7.17 - Coco had found her new home.	2def67952	1.0	images/2def67952-1.jpg

5 rows × 25 columns

# Expand image paths for loading in training

image_col = 'Images'
train_data[image_col] = train_data[image_col].apply(lambda ele: ele.split(';')[0]) # Use the first image for a quick tutorial
test_data[image_col] = test_data[image_col].apply(lambda ele: ele.split(';')[0])


def path_expander(path, base_folder):
    # Turn each ';'-separated relative image path into an absolute path under base_folder
    path_l = path.split(';')
    return ';'.join([os.path.abspath(os.path.join(base_folder, p)) for p in path_l])

train_data[image_col] = train_data[image_col].apply(lambda ele: path_expander(ele, base_folder=dataset_path))
test_data[image_col] = test_data[image_col].apply(lambda ele: path_expander(ele, base_folder=dataset_path))

train_data[image_col].iloc[0]

'/content/drive/My Drive/Books_Writings/NLPBook/ag_automm_tutorial/petfinder_for_tutorial/images/7d7a39d71-1.jpg'

example_row = train_data.iloc[0]

example_row

	0
Type	2
Name	Yumi Hamasaki
Age	4
Breed1	292
Breed2	265
Gender	2
Color1	1
Color2	5
Color3	7
MaturitySize	2
FurLength	2
Vaccinated	1
Dewormed	3
Sterilized	2
Health	1
Quantity	1
Fee	0
State	41326
RescuerID	bcc4e1b9557a8b3aaf545ea8e6e86991
VideoAmt	0
Description	I rescued Yumi Hamasaki at a food stall far away in Kelantan. At that time i was on my way back to KL, she was suffer from stomach problem and looking very2 sick.. I send her to vet & get the treatment + vaccinated and right now she's very2 healthy.. About yumi : - love to sleep with ppl - she will keep on meowing if she's hugry - very2 active, always seeking for people to accompany her playing - well trained (poo+pee in her own potty) - easy to bathing - I only feed her with these brands : IAMS, Kittenbites, Pro-formance Reason why i need someone to adopt Yumi: I just married and need to ...
PetID	7d7a39d71
PhotoAmt	3.0
AdoptionSpeed	0
Images	/content/drive/My Drive/Books_Writings/NLPBook/ag_automm_tutorial/petfinder_for_tutorial/images/7d7a39d71-1.jpg

dtype: object

example_image = example_row[image_col]

from IPython.display import Image, display
pil_img = Image(filename=example_image)
display(pil_img)

_images/2c51d770716edcf63f3988523cbc934d4fc245ed0f8ee91864e7d7da6f3f1c81.jpg

%%time

from autogluon.multimodal import MultiModalPredictor
predictor = MultiModalPredictor(label=label_col)
predictor.fit(
    train_data=train_data,
    time_limit=120, # seconds
)

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<timed exec> in <module>

/usr/local/lib/python3.11/dist-packages/autogluon/multimodal/__init__.py in <module>
      4     pass
      5 
----> 6 from . import constants, data, learners, models, optimization, predictor, problem_types, utils
      7 from .predictor import MultiModalPredictor
      8 from .utils import download

/usr/local/lib/python3.11/dist-packages/autogluon/multimodal/data/__init__.py in <module>
      1 from . import collator, infer_types, randaug, utils
----> 2 from .datamodule import BaseDataModule
      3 from .dataset import BaseDataset
      4 from .dataset_mmlab import MultiImageMixDataset
      5 from .infer_types import (

/usr/local/lib/python3.11/dist-packages/autogluon/multimodal/data/datamodule.py in <module>
      2 
      3 import pandas as pd
----> 4 from lightning.pytorch import LightningDataModule
      5 from torch.utils.data import DataLoader, Dataset
      6 

/usr/local/lib/python3.11/dist-packages/lightning/__init__.py in <module>
     18 from lightning.fabric.fabric import Fabric  # noqa: E402
     19 from lightning.fabric.utilities.seed import seed_everything  # noqa: E402
---> 20 from lightning.pytorch.callbacks import Callback  # noqa: E402
     21 from lightning.pytorch.core import LightningDataModule, LightningModule  # noqa: E402
     22 from lightning.pytorch.trainer import Trainer  # noqa: E402

/usr/local/lib/python3.11/dist-packages/lightning/pytorch/__init__.py in <module>
     25 from lightning.fabric.utilities.seed import seed_everything  # noqa: E402
     26 from lightning.fabric.utilities.warnings import disable_possible_user_warnings  # noqa: E402
---> 27 from lightning.pytorch.callbacks import Callback  # noqa: E402
     28 from lightning.pytorch.core import LightningDataModule, LightningModule  # noqa: E402
     29 from lightning.pytorch.trainer import Trainer  # noqa: E402

/usr/local/lib/python3.11/dist-packages/lightning/pytorch/callbacks/__init__.py in <module>
     12 # See the License for the specific language governing permissions and
     13 # limitations under the License.
---> 14 from lightning.pytorch.callbacks.batch_size_finder import BatchSizeFinder
     15 from lightning.pytorch.callbacks.callback import Callback
     16 from lightning.pytorch.callbacks.checkpoint import Checkpoint

/usr/local/lib/python3.11/dist-packages/lightning/pytorch/callbacks/batch_size_finder.py in <module>
     24 
     25 import lightning.pytorch as pl
---> 26 from lightning.pytorch.callbacks.callback import Callback
     27 from lightning.pytorch.tuner.batch_size_scaling import _scale_batch_size
     28 from lightning.pytorch.utilities.exceptions import MisconfigurationException, _TunerExitException

/usr/local/lib/python3.11/dist-packages/lightning/pytorch/callbacks/callback.py in <module>
     20 
     21 import lightning.pytorch as pl
---> 22 from lightning.pytorch.utilities.types import STEP_OUTPUT
     23 
     24 

/usr/local/lib/python3.11/dist-packages/lightning/pytorch/utilities/types.py in <module>
     34 from torch.optim import Optimizer
     35 from torch.optim.lr_scheduler import LRScheduler, ReduceLROnPlateau
---> 36 from torchmetrics import Metric
     37 from typing_extensions import NotRequired, Required
     38 

/usr/local/lib/python3.11/dist-packages/torchmetrics/__init__.py in <module>
     20         PIL.PILLOW_VERSION = PIL.__version__
     21 
---> 22 from torchmetrics import functional  # noqa: E402
     23 from torchmetrics.aggregation import (  # noqa: E402
     24     CatMetric,

/usr/local/lib/python3.11/dist-packages/torchmetrics/functional/__init__.py in <module>
     12 # See the License for the specific language governing permissions and
     13 # limitations under the License.
---> 14 from torchmetrics.functional.audio._deprecated import _permutation_invariant_training as permutation_invariant_training
     15 from torchmetrics.functional.audio._deprecated import _pit_permutate as pit_permutate
     16 from torchmetrics.functional.audio._deprecated import (

/usr/local/lib/python3.11/dist-packages/torchmetrics/functional/audio/__init__.py in <module>
     12 # See the License for the specific language governing permissions and
     13 # limitations under the License.
---> 14 from torchmetrics.functional.audio.pit import permutation_invariant_training, pit_permutate
     15 from torchmetrics.functional.audio.sdr import (
     16     scale_invariant_signal_distortion_ratio,

/usr/local/lib/python3.11/dist-packages/torchmetrics/functional/audio/pit.py in <module>
     20 from typing_extensions import Literal
     21 
---> 22 from torchmetrics.utilities import rank_zero_warn
     23 from torchmetrics.utilities.imports import _SCIPY_AVAILABLE
     24 

/usr/local/lib/python3.11/dist-packages/torchmetrics/utilities/__init__.py in <module>
     12 # See the License for the specific language governing permissions and
     13 # limitations under the License.
---> 14 from torchmetrics.utilities.checks import check_forward_full_state_property
     15 from torchmetrics.utilities.data import (
     16     dim_zero_cat,

/usr/local/lib/python3.11/dist-packages/torchmetrics/utilities/checks.py in <module>
     23 from torch import Tensor
     24 
---> 25 from torchmetrics.metric import Metric
     26 from torchmetrics.utilities.data import select_topk, to_onehot
     27 from torchmetrics.utilities.enums import DataType

/usr/local/lib/python3.11/dist-packages/torchmetrics/metric.py in <module>
     28 from torch.nn import Module
     29 
---> 30 from torchmetrics.utilities.data import (
     31     _flatten,
     32     _squeeze_if_scalar,

/usr/local/lib/python3.11/dist-packages/torchmetrics/utilities/data.py in <module>
     20 
     21 from torchmetrics.utilities.exceptions import TorchMetricsUserWarning
---> 22 from torchmetrics.utilities.imports import _TORCH_GREATER_EQUAL_1_12, _XLA_AVAILABLE
     23 from torchmetrics.utilities.prints import rank_zero_warn
     24 

/usr/local/lib/python3.11/dist-packages/torchmetrics/utilities/imports.py in <module>
     52 _GAMMATONE_AVAILABLE: bool = package_available("gammatone")
     53 _TORCHAUDIO_AVAILABLE: bool = package_available("torchaudio")
---> 54 _TORCHAUDIO_GREATER_EQUAL_0_10: Optional[bool] = compare_version("torchaudio", operator.ge, "0.10.0")
     55 _SACREBLEU_AVAILABLE: bool = package_available("sacrebleu")
     56 _REGEX_AVAILABLE: bool = package_available("regex")

/usr/local/lib/python3.11/dist-packages/lightning_utilities/core/imports.py in compare_version(package, op, version, use_base_version)
     76     """
     77     try:
---> 78         pkg = importlib.import_module(package)
     79     except ImportError:
     80         return False

/usr/lib/python3.11/importlib/__init__.py in import_module(name, package)
    124                 break
    125             level += 1
--> 126     return _bootstrap._gcd_import(name[level:], package, level)
    127 
    128 

/usr/local/lib/python3.11/dist-packages/torchaudio/__init__.py in <module>
      1 # Initialize extension and backend first
----> 2 from . import _extension  # noqa  # usort: skip
      3 from ._backend import (  # noqa  # usort: skip
      4     AudioMetaData,
      5     get_audio_backend,

/usr/local/lib/python3.11/dist-packages/torchaudio/_extension/__init__.py in <module>
     36 _IS_ALIGN_AVAILABLE = False
     37 if _IS_TORCHAUDIO_EXT_AVAILABLE:
---> 38     _load_lib("libtorchaudio")
     39 
     40     import torchaudio.lib._torchaudio  # noqa

/usr/local/lib/python3.11/dist-packages/torchaudio/_extension/utils.py in _load_lib(lib)
     58     if not path.exists():
     59         return False
---> 60     torch.ops.load_library(path)
     61     return True
     62 

/usr/local/lib/python3.11/dist-packages/torch/_ops.py in load_library(self, path)
   1348             # static (global) initialization code in order to register custom
   1349             # operators with the JIT.
-> 1350             ctypes.CDLL(path)
   1351         self.loaded_libraries.add(path)
   1352 

/usr/lib/python3.11/ctypes/__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error, winmode)
    374 
    375         if handle is None:
--> 376             self._handle = _dlopen(self._name, mode)
    377         else:
    378             self._handle = handle

OSError: /usr/local/lib/python3.11/dist-packages/torchaudio/lib/libtorchaudio.so: undefined symbol: _ZN2at4_ops9fft_irfft4callERKNS_6TensorESt8optionalIN3c106SymIntEElS5_ISt17basic_string_viewIcSt11char_traitsIcEEE
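
The undefined-symbol error above usually means that the compiled torchaudio library was built against a different PyTorch release than the one now installed, which can happen when the install step replaces torch but leaves Colab's preinstalled torchaudio in place. The following is a minimal diagnostic sketch, not part of AutoGluon: the helper names `installed_version` and `same_release` are hypothetical, and the check assumes that torchaudio 2.x releases share their major.minor number with the torch release they were built for. It reads versions from package metadata, so torchaudio itself is never imported.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def same_release(v1, v2):
    """Compare major.minor only, ignoring local build tags such as '+cpu'."""
    def key(v):
        return tuple(v.split("+")[0].split(".")[:2])
    return key(v1) == key(v2)


# Versions come from metadata, so the broken shared library is not loaded.
torch_v = installed_version("torch")
audio_v = installed_version("torchaudio")
print("torch:", torch_v, "| torchaudio:", audio_v)

if torch_v and audio_v and not same_release(torch_v, audio_v):
    print("Version mismatch: reinstall torchaudio to match torch, or uninstall it.")
```

If the versions differ, reinstalling torchaudio from the same index as torch is one option. Since the traceback shows that the version check only treats `ImportError` as "package not available", uninstalling torchaudio with `!pip uninstall -y torchaudio` and restarting the runtime is another option when audio support is not needed.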

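# Evaluate the fitted predictor on the held-out test set using ROC AUC
scores = predictor.evaluate(test_data, metrics=["roc_auc"])
scores

# Predict labels from the test features only (label column dropped)
predictions = predictor.predict(test_data.drop(columns=label_col))
print(predictions[:5])

# True labels for the same rows, for comparison with the predictions
print(test_data[label_col][:5])

# Predicted class probabilities for the same test rows
probas = predictor.predict_proba(test_data.drop(columns=label_col))
probas[:5]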
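
To see how the probabilities relate to the predicted labels, here is a small sketch on toy data. It assumes `probas` is a pandas DataFrame with one column per class; the class names and values below are made up for illustration and do not come from the news dataset.

```python
import pandas as pd

# Toy stand-ins for the objects in the cells above (values are made up).
probas = pd.DataFrame(
    {"business": [0.7, 0.2, 0.1], "sports": [0.3, 0.8, 0.9]}
)
labels = pd.Series(["business", "sports", "business"])

# The predicted label for each row is the class with the highest probability.
predicted = probas.idxmax(axis=1)

# Accuracy is the fraction of rows where prediction and true label agree.
accuracy = (predicted == labels).mean()
print(predicted.tolist(), round(accuracy, 3))
```

With real predictor output, the same two lines can be applied to `probas` and `test_data[label_col]`, provided both share the same row index.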