MassQueryLanguage

Commits

List of commits on branch master.
Unverified
21353acbab1afc321a2141460e9899dab1c69bac

bug fix to make sure the caches are not in one big flat file

mwang87 committed a year ago
Verified
63cbbe7d02421502a063b88401f6874354d3ede2

Merge pull request #242 from mwang87/file-size-symlinks

mwang87 committed a year ago
Unverified
fb933192e2680d8577c0a31cccd8c01eea4b6812

bug fix to make sure when symlinks its also ok

mwang87 committed a year ago
Unverified
77b30de2da841bf06887cb2f5066f5fbc44571ed

using actual function for finding path for cached file

mwang87 committed a year ago
Unverified
a1b8eee2afce40dfb7b781b88254a03a3741c1e3

rename bug fix

mwang87 committed a year ago
Unverified
fb05e5d3fda1164f8760d4d23a8bff0e2513bd5c

making sure all the caching parameters are passed through

mwang87 committed a year ago

README

The README file for this repository.

Mass Spec Query Language

The Mass Spec Query Language (MassQL) is a domain-specific language for expressing queries succinctly in mass spectrometry terms. It is inspired by SQL, but it bakes in assumptions of mass spectrometry to make querying more natural for mass spectrometry users. Broadly, we design it according to several principles:

  1. Expressiveness - Capture complex mass spectrometry patterns that the community would like to look for
  2. Precision - Prescribe exactly how to find data, without ambiguity
  3. Scalability - Facilitate querying anything from a single spectrum up to entire repositories of data
  4. Relatively Natural - MassQL should be relatively easy to read and write, and even usable for communicating ideas about mass spectrometry. You know, like a language.

Repository Structure

This repository defines the language and its reference implementation. It contains several parts:

  1. Language Grammar
  2. Reference Implementation Python API
  3. Command Line Utility to execute queries
  4. Nextflow Workflow For Large Scale Analysis
  5. ProteoSAFe workflow
  6. Dash interactive exploration

Developers/Contact

Mingxun Wang is the main creator and developer of MassQL. Contact me about contributing to it or using it!

Language Specification/Documentation

Check out the specifics of the language, examples, and design patterns in the documentation.

Documentation Link

Python API

To install massql:

pip install massql

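Here is the most basic operation you can do:

from massql import msql_engine

# Example inputs, borrowed from the command line example in this README
input_filename, input_query = "test.mzML", "QUERY scaninfo(MS2DATA)"
results_df = msql_engine.process_query(input_query, input_filename)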

If you have already loaded the data into data frames, you can pass them in directly:

from massql import msql_engine
from massql import msql_fileloading

# Loading Data
ms1_df, ms2_df = msql_fileloading.load_data(input_filename)

# Executing Query
results_df = msql_engine.process_query(input_query, input_filename, ms1_df=ms1_df, ms2_df=ms2_df)
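When the same query needs to run over several files, the call above can be wrapped in a small loop. The sketch below is not part of MassQL: run_query_over_files is a hypothetical helper, and it takes the query function as an argument so that it can be pointed at msql_engine.process_query, which is assumed to accept (input_query, input_filename) as shown above.

```python
from typing import Any, Callable, Dict, Iterable


def run_query_over_files(
    query: str,
    filenames: Iterable[str],
    process_query: Callable[[str, str], Any],
) -> Dict[str, Any]:
    """Run one query against each file and return the results keyed by filename.

    Hypothetical helper, not part of massql. `process_query` is expected to
    behave like msql_engine.process_query(input_query, input_filename).
    """
    results: Dict[str, Any] = {}
    for filename in filenames:
        # One call per file; whatever the engine returns is stored unchanged.
        results[filename] = process_query(query, filename)
    return results
```

With massql installed, this could be called as run_query_over_files("QUERY scaninfo(MS2DATA)", ["test.mzML"], msql_engine.process_query). Passing the function in, rather than importing it inside the helper, also makes the loop easy to exercise with a stub.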

Command Line Tool

You can use the massql command line tool to run queries directly or as a step in a pipeline.

An example of what you can do:

massql test.mzML "QUERY scaninfo(MS2DATA)" --output_file results.tsv
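To call the command line tool from a Python pipeline step, one option is to build the same argument list and hand it to subprocess. This is a minimal sketch: build_massql_command and run_massql are hypothetical helper names, and the only arguments used are the ones that appear in the example above (input file, query string, --output_file).

```python
import subprocess
from typing import List


def build_massql_command(input_file: str, query: str, output_file: str) -> List[str]:
    """Build the argument list for the massql invocation shown above."""
    # Passing a list (not a shell string) avoids having to quote the query.
    return ["massql", input_file, query, "--output_file", output_file]


def run_massql(input_file: str, query: str, output_file: str) -> None:
    """Run massql; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_massql_command(input_file, query, output_file), check=True)
```

For example, run_massql("test.mzML", "QUERY scaninfo(MS2DATA)", "results.tsv") would issue the same command as the line above, assuming massql is installed and on the PATH.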

Web API

API Version

/api

Parsing a query into an intermediate JSON string

/parse?query=<query string>
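Because a MassQL query contains spaces and parentheses, it needs to be URL-encoded before being placed in the query parameter. The sketch below shows one way to do that with the standard library. The base URL is a placeholder assumption (this README does not state where the service is hosted), the helper names are hypothetical, and the response is assumed to be JSON based on the description above.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_parse_url(base_url: str, query: str) -> str:
    """Return the /parse URL for a query, with the query string percent-encoded."""
    params = urlencode({"query": query}, quote_via=quote)
    return f"{base_url.rstrip('/')}/parse?{params}"


def fetch_parsed_query(base_url: str, query: str):
    """Call /parse and decode the body as JSON (assumed response format)."""
    with urlopen(build_parse_url(base_url, query)) as response:
        return json.loads(response.read().decode("utf-8"))
```

For instance, build_parse_url("http://localhost:5000", "QUERY scaninfo(MS2DATA)") produces a URL ending in /parse?query=QUERY%20scaninfo%28MS2DATA%29. The host and port here are illustrative only.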

Example Link

Visualization of Query

Visualization image of MS1 spectra

/visualize/ms1

Example Link

Visualization image of MS2 spectra

/visualize/ms2

Testing

To run tests, you'll need to first fetch some fixtures that are not bundled with the git repo: cd tests && sh ./get_data.sh

You will also want to install the extra requirements for the test suite: pip install -r requirements_test.txt

License

MIT License