Longest common subsequence matching in audio fingerprints to remove ads and intros from your podcasts
Vaguely working! Matching isn't very good, but two passes at 30% similarity with a minimum span of 10 seconds have yielded good results for two inputs.
- open `audio-adblock.go` and add the filenames of the two inputs
- `go run audio-adblock.go` will read them in and start processing: data will be fingerprinted, and matching sections will be removed
- outputs will be written to `A.mp3` and `B.mp3`, as hardcoded in `audio-adblock.go`
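
For a rough idea of what the matching step involves, here is a minimal sketch. It is not the code in `audio-adblock.go`. It assumes each input has already been reduced to a slice of 32-bit fingerprint frames at a fixed frame rate, treats two frames as matching when a given fraction of their bits agree, and finds the longest contiguous shared run with a longest-common-substring dynamic program (a simpler relative of longest common subsequence). The names `similarity`, `longestCommonRun`, `cut`, `minSim` and `minFrames` are all hypothetical, and the threshold used here is only an example; the 30% figure above may be defined differently in the real tool.

```go
package main

import (
	"fmt"
	"math/bits"
)

// similarity returns the fraction of the 32 bits on which a and b agree.
// (Assumption: frames are 32-bit values compared bit by bit.)
func similarity(a, b uint32) float64 {
	return float64(32-bits.OnesCount32(a^b)) / 32.0
}

// longestCommonRun finds the longest contiguous run of frames present in both
// a and b, treating two frames as equal when similarity >= minSim.
// It returns the start index in a, the start index in b, and the run length.
// Runs shorter than minFrames are reported as length 0.
func longestCommonRun(a, b []uint32, minSim float64, minFrames int) (startA, startB, length int) {
	// prev[j] and cur[j] hold the length of the matching run ending at
	// a[i-2], b[j-1] and a[i-1], b[j-1] respectively.
	prev := make([]int, len(b)+1)
	cur := make([]int, len(b)+1)
	for i := 1; i <= len(a); i++ {
		for j := 1; j <= len(b); j++ {
			if similarity(a[i-1], b[j-1]) >= minSim {
				cur[j] = prev[j-1] + 1
				if cur[j] > length {
					length = cur[j]
					startA = i - length
					startB = j - length
				}
			} else {
				cur[j] = 0
			}
		}
		prev, cur = cur, prev
	}
	if length < minFrames {
		return 0, 0, 0
	}
	return startA, startB, length
}

// cut returns a copy of s with the half-open range [start, start+length) removed.
func cut(s []uint32, start, length int) []uint32 {
	out := make([]uint32, 0, len(s)-length)
	out = append(out, s[:start]...)
	return append(out, s[start+length:]...)
}

func main() {
	// Made-up frames: both inputs contain the same four-frame "ad",
	// with one bit of noise in b's copy of the second ad frame.
	a := []uint32{0xFFFF0000, 0xAAAAAAAA, 0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF, 0xCCCCCCCC}
	b := []uint32{0xF0F0F0F0, 0x55555555, 0x33333332, 0x0F0F0F0F, 0x00FF00FF, 0xFF00FF00}

	sa, sb, n := longestCommonRun(a, b, 0.9, 3)
	fmt.Printf("shared run: a[%d:%d] and b[%d:%d]\n", sa, sa+n, sb, sb+n)
	fmt.Println("frames left:", len(cut(a, sa, n)), len(cut(b, sb, n)))
}
```

In this sketch a minimum span of 10 seconds would become `minFrames = 10 * framesPerSecond` for whatever frame rate the fingerprinter produces, and a second pass would simply call `longestCommonRun` again on the cut slices. The table costs `len(a) * len(b)` comparisons, so long episodes would need a windowed or hashed approach.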
- [ ] retain matched fingerprints in memory
- [ ] serialize fingerprints for storage
- [ ] serialize associated audio snippet for storage
- [ ] allow more than two inputs for fingerprinting, limit to one output
- [ ] add a halfhearted CLI
- [ ] switch to a multithreaded LAME encoder, or ditch LAME entirely