
Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging

arXiv

Abstract

Deep model merging represents an emerging research direction that combines multiple fine-tuned models to harness their specialized capabilities across different tasks and domains. Current model merging techniques focus on merging all available models simultaneously, with weight-interpolation-based methods being the predominant approaches. However, these conventional approaches are not well suited to scenarios where models become available sequentially, and they often suffer from high memory requirements and potential interference between tasks. In this study, we propose a training-free projection-based continual merging method that processes models sequentially through orthogonal projections of weight matrices and an adaptive scaling mechanism. Our method operates by projecting new parameter updates onto subspaces orthogonal to existing merged parameter updates while using adaptive scaling to maintain stable parameter distances, enabling efficient sequential integration of task-specific knowledge. Our approach maintains constant memory complexity with respect to the number of models, minimizes interference between tasks through orthogonal projections, and retains the performance of previously merged models through adaptive task-vector scaling. Extensive experiments on CLIP-ViT models demonstrate that our method achieves a 5-8% average accuracy improvement while maintaining robust performance across different task orderings.
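
To make the core idea concrete, below is a minimal sketch (in PyTorch) of one sequential merging step: projecting a new task's parameter update onto the subspace orthogonal to the column space of the already-merged update via an SVD, then rescaling. The function name, the rank threshold, and the norm-based scaling rule are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def continual_merge_step(merged_update: torch.Tensor,
                         new_update: torch.Tensor,
                         rank_tol: float = 1e-5) -> torch.Tensor:
    """One sequential merging step (illustrative sketch, not the exact method).

    Projects `new_update` onto the subspace orthogonal to the column space
    of `merged_update`, then rescales so the result keeps a stable norm
    (a stand-in for the paper's adaptive scaling mechanism).
    """
    # SVD of the existing merged update; the kept columns of U span its
    # numerically significant column space.
    U, S, _ = torch.linalg.svd(merged_update, full_matrices=False)
    U = U[:, S > rank_tol * S.max()]
    # Remove the component of the new update that lies in span(U).
    orthogonal_part = new_update - U @ (U.T @ new_update)
    combined = merged_update + orthogonal_part
    # Illustrative adaptive scaling: keep the merged update's norm comparable
    # to the norms of the individual updates.
    target_norm = 0.5 * (merged_update.norm() + new_update.norm())
    return combined * (target_norm / combined.norm())
```

In such a pipeline, each weight matrix of an incoming model's task vector would be processed this way before being folded into the running merge, so memory stays constant as more models arrive.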

Installation

Install the latest development version:

pip install -e . # install the package in editable mode
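
Running pip install -e . assumes the repository has already been cloned and you are in its root directory; a typical sequence might look like the following, with the repository URL left as a placeholder:

```bash
git clone <repository-url> opcm   # substitute the actual repository URL
cd opcm
pip install -e .                  # install the package in editable mode
```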

Project Structure

The project is structured as follows:

  • fusion_bench/: the main package of the benchmark.
    • method: contains the implementation of the fusion methods.

      naming convention: fusion_bench/method/{method_name}/{variant}.py contains the implementation of the specific method or its variants. For example, fusion_bench/method/regmean/clip_regmean.py contains the implementation of the RegMean algorithm for CLIP vision models.

    • modelpool: contains the implementation of the model pool, responsible for managing the models and datasets to be loaded.
    • taskpool: contains the implementation of the task pool, responsible for evaluating the performance of models returned by the algorithm.
  • config/: configuration files for the benchmark. We use Hydra to manage the configurations (a usage sketch follows this list).
    • method: configuration files for the fusion methods.

      naming convention: config/method/{method_name}/{variant}.yaml contains the configuration for the specific method or its variants.

    • modelpool: configuration files for the model pool.
    • taskpool: configuration files for the task pool.
    • model: configuration files for the models.
    • dataset: configuration files for the datasets.
  • examples/: example scripts for running some of the experiments.

    naming convention: examples/{method_name}/ contains files such as bash scripts and Jupyter notebooks for the specific method.

  • tests/: unit tests for the benchmark.
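
To illustrate how these config groups compose at run time, here is a hedged sketch of launching a run with Hydra-style overrides. The fusion_bench CLI entry point and the specific config names below are illustrative assumptions; consult the config/ directory for the options that actually exist in this repository:

```bash
# pick one config from each group by its path under config/
# (names below are illustrative, not guaranteed to exist)
fusion_bench \
    method=regmean/clip_regmean \
    modelpool=clip-vit-base-patch32_TA8 \
    taskpool=clip-vit-classification_TA8
```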

How to Run the Experiments

The experiments are conducted on CLIP-ViT models. The bash scripts for running them are in the examples/opcm folder.
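
For instance, an experiment could be launched as shown below; the script name is a placeholder, so list the folder first to see which scripts actually ship with the repository:

```bash
ls examples/opcm/                     # see which experiment scripts are available
bash examples/opcm/<script-name>.sh   # placeholder for an actual script name
```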