Roadmap

This is a high-level list of what we are working on and what is completed.

Legend

✅ Completed
🕥 In progress
⬜ Planned, not started

Work in progress

(see Completed features below)

Packages and dependencies

🕥 Docker images base (as part of: https://github.com/pombredanne/conan ) #651
🕥 RubyGems base and dependencies #650 (code in https://github.com/nexB/scancode-toolkit-contrib/ )
🕥 Perl, CPAN (basic in https://github.com/nexB/scancode-toolkit-contrib/)
🕥 Go : parsing for Godep in https://github.com/nexB/scancode-toolkit-contrib/
🕥 Windows PE #652
⬜ RPMs dependencies #649
⬜ Windows Nuget dependencies #648
⬜ Bower packages #654
⬜ Python dependencies #653
⬜ CRAN
✅ Plain packages
⬜ other Java-related meta files (SBT, Ivy, Gradle, etc.)
⬜ Debian debs
⬜ other JavaScript (jspm, etc.)
⬜ other Linux distro packages

License detection

🕥 support and detect license expressions (code in https://github.com/nexB/license-expression/ )
🕥 support and detect composite licenses
⬜ support custom licenses
⬜ move licenses data set to external separate repository
⬜ Improved unknown license detection
🕥 sync with external sources (DejaCode, SPDX, etc.)

Copyrights

⬜ speed up copyright detection
⬜ improved detected lines range
⬜ streamline grammar of copyright parser
⬜ normalize holders and authors for summarization
⬜ normalize and streamline results data format

Core features

✅ pre scan filtering (ignore binaries, etc)
✅ plugins! (developed as part of GSoC by @yadsharaf)
🕥 support Python 3 #295
🕥 transparent archive extraction (as opposed to on-demand with extractcode)
🕥 .scancode configuration file for exclusions, defaults, scan failure conditions, etc.
⬜ support scan pipelines and rules to organize more complex scans
⬜ scan baselining, delta scan and failure conditions (such as license change, etc)
⬜ dedupe and similarities to avoid re-scanning. For now, only identical files are deduplicated and scanned once.
⬜ Improved logging
🕥 native support for ABC Data (See https://github.com/nexB/aboutcode/blob/master/aboutcode-data/README.rst )

Classification, summarization and deduction

🕥 File classification #426
⬜ summarize and aggregate data #377

Source code support

🕥 symbols : parsing complete in https://github.com/nexB/scancode-toolkit-contrib/
🕥 metrics : some elements in https://github.com/nexB/scancode-toolkit-contrib/

Compiled code support

🕥 ELFs : parsing complete in https://github.com/nexB/scancode-toolkit-contrib/
🕥 Java byte code : parsing complete in https://github.com/nexB/scancode-toolkit-contrib/
🕥 Windows PE : parsing complete in https://github.com/nexB/scancode-toolkit-contrib/
🕥 Mach-O : parsing complete in https://github.com/nexB/scancode-toolkit-contrib/
⬜ Dalvik/dex

Data exchange

⬜ SPDX data conversion #338

Packaging

⬜ simpler installation, automated installer
⬜ distro-friendly packaging
⬜ unbundle and package as multiple libraries (commoncode, extractcode, etc)

Documentation

⬜ integration in a build/CI loop
⬜ end to end guide to analyze a codebase
⬜ hacking guides
⬜ API doc when using ScanCode as a library

CI integration

⬜ Plugins for CI (Jenkins, etc)
⬜ Integration for CI (Travis, Appveyor, Drone, etc)

Other work in progress

ScanCode server: Spawned as its own project: https://github.com/nexB/scancode-server . Will include integrations and webhooks for GitHub and Bitbucket.
VulnerableCode: NVD and CVE lookups: Spawned as its own project: https://github.com/nexB/vulnerablecode
AboutCode manager: desktop app for scan review: Spawned as its own project: https://github.com/nexB/aboutcode-manager
DependentCode: dynamic dependencies resolutions: Spawned as its own project: https://github.com/nexB/dependentcode

Package mining and matching

(Note that this will be spawned as its own project.) Some code is in https://github.com/nexB/scancode-toolkit-contrib/

🕥 exact matching
🕥 attribute-based matching
🕥 fuzzy matching
⬜ peer-reviewed meta packages repo
⬜ basic mining of package repositories

Other

⬜ Crypto code detection

Completed features

Core scans

✅ exact license detection
✅ approximate license detection
✅ copyright detection
✅ file information (size, type, etc.)
✅ URLs, emails, authors

Outputs and UI

✅ JSON compact and pretty
✅ plain HTML tables, also usable in a spreadsheet
✅ fancy HTML 'app' with a file tree navigation, and scan results filtering, search and sorting
✅ improved scans GUI now its own project: https://github.com/nexB/aboutcode-manager
✅ simple scan summary
✅ SPDX output

Packages and dependencies

✅ common model for packages data
✅ basic support for common packages format
✅ RPM packages base
✅ NuGet packages base
✅ Python packages base
✅ PHP Composer packages support with dependencies
✅ Java Maven POM packages support with dependencies
✅ npm packages support with dependencies

Speed!

✅ accelerate license detection indexing and scanning; include caching
✅ scan using multiple processes to speed up overall scan
✅ cache per-file scan to disk and stream final results

Other

✅ archive extraction with extractcode
✅ conversion of scan results to CSV
✅ improved error handling, verbose and diagnostic output

See http://nexb.com for more.
