Roadmap

Michael Herzog edited this page Feb 19, 2019 · 25 revisions

This is a high-level list of what we are working on and what has been completed.

Legend

✅ Completed 🕥 In progress ⬜ Planned, not started

Work in progress

(see Completed features below)

Package manifests and dependency parsers

License detection

  • ✅ support and detect license expressions (code in https://github.com/nexB/license-expression/ )
  • 🕥 support and detect composite licenses
  • ⬜ support custom licenses
  • ⬜ move licenses data set to external separate repository
  • ✅ Improved unknown license detection
  • ✅ sync with external sources (DejaCode, SPDX, etc.)

Copyrights

  • ✅ speed up copyright detection
  • ✅ improved detected lines range
  • ✅ streamline grammar of copyright parser
  • ✅ normalize holders and authors for summarization
  • ✅ normalize and streamline results data format

Core features

  • ✅ pre scan filtering (ignore binaries, etc)
  • ✅ pre/post/output plugins! (developed as part of GSoC by @yadsharaf)
  • ✅ scan plugins (e.g. plugins that run a scan to collect data)
  • 🕥 support Python 3 #295
  • 🕥 transparent archive extraction (as opposed to on-demand with extractcode)
  • 🕥 scancode.yml configuration file for exclusions, defaults, scan failure conditions, etc.
  • ⬜ support scan pipelines and rules to organize more complex scans
  • ⬜ scan baselining, delta scan and failure conditions (such as license change, etc) (will be spawned as its own DeltaCode project)
  • ⬜ dedupe and similarities to avoid re-scanning. For now, only identical files are deduplicated and scanned once.
  • ⬜ Improved logging, tracing and error diagnostics
  • 🕥 native support for ABC Data (See https://github.com/nexB/aboutcode/blob/master/aboutcode-data/README.rst )
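
The "identical files are scanned only once" behavior mentioned in the list above can be illustrated with a content-hash cache. This is only a minimal sketch under stated assumptions, not ScanCode's actual implementation: `scan_once`, its `files` mapping and the `scan` callable are hypothetical names introduced here for illustration.

```python
import hashlib


def sha1_of(content: bytes) -> str:
    """Return the hex SHA-1 digest used as the identity of a file's content."""
    return hashlib.sha1(content).hexdigest()


def scan_once(files, scan):
    """Scan each distinct content only once (hypothetical helper).

    files: mapping of path -> file content as bytes
    scan:  callable taking the content bytes and returning a scan result
    Returns a mapping of path -> scan result.
    """
    cache = {}    # content digest -> scan result
    results = {}  # path -> scan result
    for path, content in files.items():
        key = sha1_of(content)
        if key not in cache:
            # First time this exact content is seen: run the real scan.
            cache[key] = scan(content)
        # Identical files share the cached result.
        results[path] = cache[key]
    return results
```

Only byte-for-byte identical files share a result in this sketch; the planned similarity-based dedupe would need a fuzzier key than a plain digest.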

Classification, summarization and deduction

  • ✅ File classification #426
  • ✅ summarize and aggregate data #377 at the top level

Source code support (some will be spawned as their own tool)

Compiled code support (will be spawned as their own tool)

Data exchange

  • ✅ SPDX data conversion #338

Packaging

  • ⬜ simpler installation, automated installer
  • ✅ distro-friendly packaging
  • ⬜ unbundle and package as multiple libraries (commoncode, extractcode, etc)

Documentation

  • ⬜ integration in a build/CI loop
  • ⬜ end to end guide to analyze a codebase
  • ⬜ hacking guides
  • ⬜ API doc when using ScanCode as a library

CI integration

  • ⬜ Plugins for CI (Jenkins, etc)
  • ⬜ Integration for CI (Travis, Appveyor, Drone, etc)

Other work in progress

Package mining and matching

(Note that this will be spawned as its own project.) Some code is in https://github.com/nexB/scancode-toolkit-contrib/

  • 🕥 exact matching
  • 🕥 attribute-based matching
  • 🕥 fuzzy matching
  • ⬜ peer-reviewed meta packages repo
  • ⬜ basic mining of package repositories

Other

  • ⬜ Crypto code detection

Completed features

Core scans

  • ✅ exact license detection
  • ✅ approximate license detection
  • ✅ copyright detection
  • ✅ file information (size, type, etc.)
  • ✅ URLs, emails, authors

Outputs and UI

  • ✅ JSON compact and pretty
  • ✅ plain HTML tables, also usable in a spreadsheet
  • ✅ fancy HTML 'app' with file tree navigation and scan results filtering, search and sorting
  • ✅ improved scan GUI, now its own project: https://github.com/nexB/aboutcode-manager
  • ✅ simple scan summary
  • ✅ SPDX output

Package and dependencies

  • ✅ common model for packages data
  • ✅ basic support for common packages format
  • ✅ RPM packages base
  • ✅ NuGet packages base
  • ✅ Python packages base
  • ✅ PHP Composer packages support with dependencies
  • ✅ Java Maven POM packages support with dependencies
  • ✅ npm packages support with dependencies

Speed!

  • ✅ accelerate license detection indexing and scanning; include caching
  • ✅ scan using multiple processes to speed up overall scan
  • ✅ cache per-file scan to disk and stream final results

Other

  • ✅ archive extraction with extractcode
  • ✅ conversion of scan results to CSV
  • ✅ improved error handling, verbose and diagnostic output