
1.2 Package Metadata Agent

The package metadata agent puts data about each package (rpm and deb) into the database. This data is used to answer the following questions:

  1. What are the declared license(s) for a package?
    1. Declared licenses appear in COPYRIGHT, LICENSE, and COPYING files as well as RPM .spec files.
  2. Is uploadtree_pk a package? This is fundamental to package-based reports. Including:
  3. Given a name and optional version, what are the matching packages?
    1. Returns: 0-n uploadtree_pk's.
    2. The name could be an exact or substring match
    3. Needed for REST API. Currently used by sniffer.
  4. Is the package (uploadtree_pk) binary or source?
    1. Needed for distro reports
  5. What do we know about a package?
    1. Needed for: distro db
    2. Needed for: many queries based on this package info (including a general query by form)

1.2 Package Metadata Agent Scope and Estimate

1. The Package Agent provides the following data and stores it in the database

  1. Pkg Metadata (rpm and debian package)
    1. RPM Metadata: use an rpm library or another method instead of forking an rpm -q for each file
    2. Debian Metadata
  2. Declared license(s) for a package
    • The Package Agent should use the results of the nomos license analysis of these files (COPYRIGHT, LICENSE, and COPYING files as well as RPM .spec files) to get the declared licenses
tasks: 1. Create a new package table to store Pkg Metadata                                   **(1~2 days)**
       2. Package Agent calls the Spec agent to put rpm metadata into the package table      **(3 days)**
       3. Package Agent calls the Debian agent to put debian metadata into the package table **(4~5 days)**
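
As an illustration of task 1, here is a minimal sketch of what a package table could look like. Every column name is a hypothetical choice made for this example, and SQLite is used only so that the sketch is self-contained; it is not the project's database and this is not the 1.2 schema.

```python
import sqlite3

# Hypothetical schema: all column names are assumptions for illustration.
SCHEMA = """
CREATE TABLE pkg (
    pkg_pk        INTEGER PRIMARY KEY,
    uploadtree_fk INTEGER NOT NULL,            -- the uploadtree_pk this row describes
    pkg_format    TEXT NOT NULL CHECK (pkg_format IN ('rpm', 'deb')),
    pkg_name      TEXT NOT NULL,
    pkg_version   TEXT,
    is_source     INTEGER NOT NULL DEFAULT 0,  -- 1 = source package, 0 = binary
    description   TEXT
);
"""


def create_pkg_table(conn):
    """Create the illustrative pkg table on an open connection."""
    conn.executescript(SCHEMA)


def add_package(conn, uploadtree_pk, pkg_format, name,
                version=None, is_source=False, description=None):
    """Insert one package row and return its primary key."""
    cur = conn.execute(
        "INSERT INTO pkg (uploadtree_fk, pkg_format, pkg_name, pkg_version,"
        " is_source, description) VALUES (?, ?, ?, ?, ?, ?)",
        (uploadtree_pk, pkg_format, name, version, int(is_source), description),
    )
    conn.commit()
    return cur.lastrowid
```

Keeping rpm and debian rows in one table with a format column is only one option; separate tables per format would fit the same tasks.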

2. Needed for the package browser and package license history. The Package Agent should answer the following questions

  1. Is uploadtree_pk a package? The mimetype agent can answer this question
  2. Is uploadtree_pk binary or source?
  3. What are the package dependencies?
tasks:   1. Identify an uploadtree_pk as a package (rpm or debian) in the database
         2. Identify an uploadtree_pk as a binary or source package in the database         **(2 days)**
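
The two identification tasks can be sketched with a simple filename heuristic. This is an assumption-laden stand-in: it relies only on common naming conventions (.src.rpm for source RPMs, .dsc for Debian source packages), whereas the notes above expect the mimetype agent to decide whether something is a package. The function name `classify_package` is hypothetical.

```python
def classify_package(filename):
    """Return (format, kind) for a package file name, or None if it does not
    look like an rpm or debian package.

    Heuristic only: it inspects the file name, not the file contents.
    """
    name = filename.lower()
    # Source RPMs conventionally end in .src.rpm (or .nosrc.rpm).
    if name.endswith(".src.rpm") or name.endswith(".nosrc.rpm"):
        return ("rpm", "source")
    if name.endswith(".rpm"):
        return ("rpm", "binary")
    # A Debian source package is described by its .dsc file.
    if name.endswith(".dsc"):
        return ("deb", "source")
    if name.endswith(".deb") or name.endswith(".udeb"):
        return ("deb", "binary")
    return None
```

A real agent would more likely combine the mimetype result with the package header, since file names can be wrong or missing.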

3. Package info report

task: Package info report web ui **(2 days)**

Package Info report

Note: We should decide which data to save and which to ignore.

4. Given a name and optional version, what are the matching packages?

  • Given a name and optional version, search the package metadata in the database, find all matching packages, then return all matching uploadtree_pks
Input: name and optional version
Output: 0-n uploadtree_pks that match the input name and version  **(4 days)**
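
The lookup described above could be sketched as follows. The column names, the `find_packages` and `demo_connection` helpers, and the sample rows are all made up for illustration, and SQLite (including its `instr` function for the substring match) stands in for the real database.

```python
import sqlite3


def find_packages(conn, name, version=None, exact=True):
    """Return the uploadtree_pk values (0-n) of packages matching the name
    and, if given, the version."""
    if exact:
        sql = "SELECT uploadtree_fk FROM pkg WHERE pkg_name = ?"
    else:
        # Substring match; instr() avoids having to escape LIKE wildcards.
        sql = "SELECT uploadtree_fk FROM pkg WHERE instr(pkg_name, ?) > 0"
    params = [name]
    if version is not None:
        sql += " AND pkg_version = ?"
        params.append(version)
    sql += " ORDER BY uploadtree_fk"
    return [row[0] for row in conn.execute(sql, params)]


def demo_connection():
    """Build an in-memory table with made-up rows, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pkg (uploadtree_fk INTEGER, pkg_name TEXT, pkg_version TEXT)"
    )
    conn.executemany(
        "INSERT INTO pkg VALUES (?, ?, ?)",
        [
            (101, "bash", "3.2"),
            (102, "bash", "4.0"),
            (103, "bash-completion", "1.0"),
            (104, "glibc", "2.10"),
        ],
    )
    return conn
```

With the sample rows, an exact search for "bash" returns the two bash rows, while `exact=False` also picks up bash-completion.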

specagent

Currently the specagent depends on the mimetype agent to identify 'application/x-rpm-spec'. Then it processes all those pfiles for a given upload by forking an rpm -q for each file. It keeps track of which files have been processed by a “Processed” attrib record.

  1. We should use an rpm library (see rpm5.org) rather than fork an rpm -q for each file.
  2. 1.2 uses agent_runstatus together with the data in the pkg table, rather than setting attribs on each file.
  3. specagent is missing “Description”
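
To make the metadata in question concrete, here is a rough sketch that reads the preamble tags and the main %description section of a .spec file in-process, without forking a command per file. It is a plain line-oriented reader, not an rpm library: it does not expand macros or handle sub-packages, and the function name `parse_spec` and the sample spec are invented for this example.

```python
import re

# "Tag: value" lines in the spec preamble, e.g. "Name: hello".
TAG_RE = re.compile(r"^([A-Za-z][A-Za-z0-9]*)\s*:\s*(.*)$")

# Lines starting with one of these words end the preamble or the previous section.
SECTION_MARKERS = {
    "%description", "%package", "%prep", "%build", "%install", "%check",
    "%clean", "%files", "%changelog", "%pre", "%post", "%preun", "%postun",
}


def parse_spec(text):
    """Return a dict of preamble tags plus a "Description" entry."""
    tags = {}
    description_lines = []
    section = None  # None means we are still in the preamble
    for line in text.splitlines():
        stripped = line.strip()
        words = stripped.split()
        if words and words[0] in SECTION_MARKERS:
            # Keep the whole marker line so "%description devel" (a
            # sub-package) is not mistaken for the main description.
            section = stripped
            continue
        if section is None:
            match = TAG_RE.match(stripped)
            if match:
                # The first occurrence of a tag wins.
                tags.setdefault(match.group(1), match.group(2).strip())
        elif section == "%description":
            description_lines.append(line.rstrip())
    tags["Description"] = "\n".join(description_lines).strip()
    return tags


SAMPLE_SPEC = """\
Name: hello
Version: 2.4
Summary: Prints a greeting
License: GPLv3+
Group: Applications/Text

%description
The hello program prints
a friendly greeting.

%prep
%setup -q
"""
```

Calling `parse_spec(SAMPLE_SPEC)` yields the raw tag values together with the description text that the current specagent does not record.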

Debian package agent

  1. debian_metadata - Explanation of debian control files and metadata.
  2. debian_extended_analysis - Potential application using debian data. Not for v 1.2

Notes

  1. Should we have separate debian and rpm package agents or one master agent? One master agent might be more efficient. One master agent also seems simpler for the case when a single upload contains both debian and rpm packages.
  2. Should metadata keywords be munged to a common name? For example:
    1. debian “Package” = RPM “Name”
    2. debian “Section” = RPM “Group”
    3. debian “Architecture” = RPM (have to get arch from filename)
    4. debian “Depends” = RPM “Requires”
I think I'd lean toward keeping the raw data and not trying to combine the meanings of these keyword fields.
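
One way to keep the raw data and still answer format-independent questions is to translate keywords at query time. The sketch below assumes metadata is held as a dict of raw fields; the `DEB_TO_RPM` table copies the pairs listed above, and the `lookup` helper is hypothetical.

```python
# Debian field -> RPM tag, as listed in the notes above. "Architecture" has
# no RPM tag in that list (the arch would come from the file name), so it
# maps to None here.
DEB_TO_RPM = {
    "Package": "Name",
    "Section": "Group",
    "Architecture": None,
    "Depends": "Requires",
}


def lookup(raw_fields, debian_key, pkg_format):
    """Fetch a value from raw, unmodified metadata, using the Debian
    spelling of the keyword regardless of the package format."""
    if pkg_format == "deb":
        return raw_fields.get(debian_key)
    rpm_key = DEB_TO_RPM.get(debian_key)
    return raw_fields.get(rpm_key) if rpm_key else None
```

The stored fields stay untouched, so a mapping that later turns out to be wrong can be corrected without reprocessing any packages.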
 
1.2pkgagent.txt · Last modified: 2009/12/18 01:18 by vincent

Copyright (C) 2007-2009 Hewlett-Packard Development Company, L.P.
FOSSology Project documentation is licensed under the GNU Free Documentation License Version 1.2