

Merging Models on the Fly Without Retraining:
A Sequential Approach to Scalable Continual Model Merging

arXiv

Abstract

Deep model merging is an emerging research direction that combines multiple fine-tuned models to harness their specialized capabilities across different tasks and domains. Current model merging techniques merge all available models simultaneously, with weight-interpolation-based methods being the predominant approach. However, these conventional approaches are ill-suited for scenarios where models become available sequentially, and they often suffer from high memory requirements and interference between tasks. In this study, we propose a training-free, projection-based continual merging method that processes models sequentially through orthogonal projections of weight matrices and an adaptive scaling mechanism. Our method projects each new parameter update onto the subspace orthogonal to the existing merged parameter updates, while an adaptive scaling mechanism maintains stable parameter distances, enabling efficient sequential integration of task-specific knowledge. Our approach maintains constant memory complexity with respect to the number of models, minimizes interference between tasks through orthogonal projections, and preserves the performance of previously merged models through adaptive task-vector scaling. Extensive experiments on CLIP-ViT models demonstrate that our method achieves a 5-8% average accuracy improvement while maintaining robust performance across different task orderings.
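The projection-then-rescale step described in the abstract can be sketched as follows. This is only an illustrative sketch, not the released implementation: the function name `continual_merge_step`, the SVD-based subspace basis, and the norm-matching scaling rule are all assumptions made for exposition.

```python
import numpy as np

def continual_merge_step(merged_update, new_update, rank_tol=1e-6):
    """Illustrative sketch of one continual merging step (not the paper's code).

    merged_update: accumulated merged parameter update (weight matrix delta).
    new_update:    parameter update of the newly arriving fine-tuned model.
    """
    # Basis of the column space spanned by the existing merged update,
    # obtained from a truncated SVD.
    U, S, _ = np.linalg.svd(merged_update, full_matrices=False)
    U = U[:, S > rank_tol]

    # Project the new update onto the subspace orthogonal to that basis,
    # so the new task's contribution does not interfere with earlier ones.
    projected = new_update - U @ (U.T @ new_update)
    candidate = merged_update + projected

    # One possible adaptive scaling choice: keep the merged update's
    # Frobenius norm ("parameter distance") stable after the merge.
    scale = np.linalg.norm(merged_update) / max(np.linalg.norm(candidate), 1e-12)
    return scale * candidate
```

Because only the running merged update is kept between steps, memory stays constant no matter how many models arrive.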

Installation

Install the latest development version in editable mode:

pip install -e . # install the package in editable mode

Project Structure

The project is structured as follows:

  • fusion_bench/: the main package of the benchmark.
    • method: contains the implementation of the fusion methods.

      naming convention: fusion_bench/method/{method_name}/{variant}.py contains the implementation of the specific method or its variants. For example, fusion_bench/method/regmean/clip_regmean.py contains the implementation of the RegMean algorithm for CLIP vision models.

    • modelpool: contains the implementation of the model pool, responsible for managing the models and dataset to be loaded.
    • taskpool: contains the implementation of the task pool, responsible for evaluating the performance of models returned by the algorithm.
  • config/: configuration files for the benchmark. We use Hydra to manage the configurations.
    • method: configuration files for the fusion methods.

      naming convention: config/method/{method_name}/{variant}.yaml contains the configuration for the specific method or its variants.

    • modelpool: configuration files for the model pool.
    • taskpool: configuration files for the task pool.
    • model: configuration files for the models.
    • dataset: configuration files for the datasets.
  • examples/: example scripts for running some of the experiments.

naming convention: examples/{method_name}/ contains files such as bash scripts and Jupyter notebooks for the specific method.

  • tests/: unit tests for the benchmark.

How to run the experiments

The experiments are conducted on CLIP-ViT models. The bash scripts to run them are located in the examples/opcm folder.
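As a rough illustration of what a sequential merging experiment does (the actual scripts live in examples/opcm and are not reproduced here), the loop below folds fine-tuned weights into a single merged model one at a time. A plain running average of task vectors stands in for the paper's projection-based update rule; `sequential_merge` and all names are hypothetical.

```python
import numpy as np

def sequential_merge(pretrained, finetuned_models):
    """Toy sequential merging loop (illustration only, not the released scripts).

    Processes models one at a time, so memory stays constant in the
    number of models: only the current merged weights are retained.
    """
    merged = pretrained.copy()
    for i, model in enumerate(finetuned_models, start=1):
        task_vector = model - pretrained  # fine-tuned weights minus base
        # Running average of task vectors stands in for the
        # projection-based update described in the abstract.
        merged = merged + (pretrained + task_vector - merged) / i
    return merged
```

In the real experiments, each `model` would be a fine-tuned CLIP-ViT checkpoint and the merge rule would be the orthogonal-projection update with adaptive scaling.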