The package metadata agent puts data about each package (rpm and deb) into the database. This data is used to answer the following questions:
What are the declared license(s) for a package?
Declared licenses appear in COPYRIGHT, LICENSE, and COPYING files as well as RPM .spec files.
Is uploadtree_pk a package? This is fundamental to package-based reports, including:
-
-
Given a name and optional version, what are the matching packages?
Returns: 0-n uploadtree_pk's.
Name could be an exact or substring match.
Needed for the REST API. Currently used by sniffer.
Is the package (uploadtree_pk) binary or source?
-
What do we know about a package?
-
-
-
-
Needed for: many queries based on this package info (including a general query by form)
1. The Package Agent provides the following data and stores it in the database:
Pkg Metadata (rpm and debian package)
RPM Metadata: use an rpm library or another method instead of forking an rpm -q for each file
Debian Metadata
Declared license(s) for a package
The Package Agent should use the results of the nomos license analysis of these files (COPYRIGHT, LICENSE, and COPYING files as well as RPM .spec files) to get the declared licenses
tasks: 1. Create a new package table to store Pkg Metadata **(1~2 days)**
2. Package Agent calls the Spec agent to put rpm metadata into the package table **(3 days)**
3. Package Agent calls the Debian agent to put debian metadata into the package table **(4~5 days)**
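The storage tasks above could be sketched roughly as follows. This is only an illustration using Python's sqlite3 module with an in-memory database. The table names `pkg` and `pkg_meta`, their columns, and the helpers `parse_debian_control` and `store_pkg` are assumptions made for the example, not the actual schema or agent code.

```python
import sqlite3

# Hypothetical schema: one row per package plus its raw key/value metadata.
SCHEMA = """
CREATE TABLE pkg (
    uploadtree_pk INTEGER PRIMARY KEY,
    pkg_format    TEXT NOT NULL,   -- 'rpm' or 'deb'
    name          TEXT NOT NULL,
    version       TEXT
);
CREATE TABLE pkg_meta (
    uploadtree_pk INTEGER NOT NULL REFERENCES pkg(uploadtree_pk),
    key           TEXT NOT NULL,   -- raw keyword, e.g. 'Depends' or 'Requires'
    value         TEXT NOT NULL
);
"""

def parse_debian_control(text):
    """Parse one Debian control stanza into a dict of raw keywords."""
    fields = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and last_key is not None:
            # A line starting with whitespace continues the previous field.
            fields[last_key] += "\n" + line.strip()
        else:
            key, _, value = line.partition(":")
            last_key = key.strip()
            fields[last_key] = value.strip()
    return fields

def store_pkg(conn, uploadtree_pk, pkg_format, name, version, meta):
    """Insert one package row and all of its raw metadata fields."""
    conn.execute(
        "INSERT INTO pkg VALUES (?, ?, ?, ?)",
        (uploadtree_pk, pkg_format, name, version),
    )
    conn.executemany(
        "INSERT INTO pkg_meta VALUES (?, ?, ?)",
        [(uploadtree_pk, k, v) for k, v in meta.items()],
    )
    conn.commit()

# Usage with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
control = (
    "Package: hello\n"
    "Version: 2.10-2\n"
    "Depends: libc6 (>= 2.14)\n"
    "Description: example package\n"
    " long description line\n"
)
meta = parse_debian_control(control)
store_pkg(conn, 101, "deb", meta["Package"], meta.get("Version"), meta)
```

Keeping the raw keywords in a separate key/value table is one possible design; it avoids deciding up front which fields deserve their own column.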
2. Needed for the Package browser and Package license history. The Package Agent should answer the following questions:
Is uploadtree_pk a package? The mimetype agent can answer this question.
Is uploadtree_pk binary or source?
What are the package dependencies?
tasks: 1. Identify an uploadtree_pk as a package (rpm or debian) in the database
2. Identify an uploadtree_pk as a binary or source package in the database **(2 days)**
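As a rough illustration of the binary/source question, the sketch below classifies a package by file name only. This is an assumed heuristic based on common naming conventions (`.src.rpm` for source RPMs, `.dsc` for Debian source packages). A real agent would more likely rely on the mimetype agent or the package header, and the function name `classify_package` is hypothetical.

```python
def classify_package(filename):
    """Return (format, kind) for a package file name, or None if not a package.

    Heuristic only: it looks at the file name suffix, not the file contents.
    """
    lower = filename.lower()
    # Check the source RPM suffixes first, since they also end in ".rpm".
    if lower.endswith((".src.rpm", ".nosrc.rpm")):
        return ("rpm", "source")
    if lower.endswith(".rpm"):
        return ("rpm", "binary")
    if lower.endswith((".deb", ".udeb")):
        return ("deb", "binary")
    if lower.endswith(".dsc"):
        return ("deb", "source")
    return None

# Usage with made-up file names.
for name in ("bash-4.2.46-34.el7.x86_64.rpm", "bash-4.2.46-34.el7.src.rpm",
             "hello_2.10-2_amd64.deb", "hello_2.10-2.dsc", "README"):
    print(name, classify_package(name))
```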
3. Package info report
task: Package info report web ui **(2 days)**
Package Info report
Note: We should decide which data to save and which to ignore.
4. Given a name and optional version, what are the matching packages?
Given a name and optional version, search the package metadata in the database, find all matching packages, then return all matching uploadtree_pks
Input: name and optional version
Output: Return 0-n uploadtree_pks that match the input name and version **(4 days)**
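The lookup described above might look something like the following sketch. It uses Python's sqlite3 module and a hypothetical `pkg` table with `uploadtree_pk`, `name`, and `version` columns; the table layout and the function name `find_packages` are assumptions for illustration, not the real schema or API.

```python
import sqlite3

def find_packages(conn, name, version=None, exact=True):
    """Return the uploadtree_pk values (0-n) matching name and optional version."""
    if exact:
        sql = "SELECT uploadtree_pk FROM pkg WHERE name = ?"
        params = [name]
    else:
        # Substring match: escape LIKE wildcards in the supplied name.
        # Note that SQLite's LIKE is case-insensitive for ASCII by default.
        escaped = (name.replace("\\", "\\\\")
                       .replace("%", "\\%")
                       .replace("_", "\\_"))
        sql = "SELECT uploadtree_pk FROM pkg WHERE name LIKE ? ESCAPE '\\'"
        params = ["%" + escaped + "%"]
    if version is not None:
        sql += " AND version = ?"
        params.append(version)
    sql += " ORDER BY uploadtree_pk"
    return [row[0] for row in conn.execute(sql, params)]

# Usage with an in-memory database and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pkg (uploadtree_pk INTEGER PRIMARY KEY, name TEXT, version TEXT)")
conn.executemany(
    "INSERT INTO pkg VALUES (?, ?, ?)",
    [(1, "glibc", "2.12"), (2, "glibc", "2.17"),
     (3, "glibc-devel", "2.17"), (4, "bash", "4.2")],
)
print(find_packages(conn, "glibc"))
print(find_packages(conn, "lib", exact=False))
```

Passing the name and version as bound parameters rather than building the SQL string by hand keeps the query safe if it is later exposed through a REST API.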
Currently the specagent depends on the mimetype agent to identify 'application/x-rpm-spec'. Then it processes all those pfiles for a given upload by forking an rpm -q for each file. It keeps track of which files have been processed by a “Processed” attrib record.
We should use an rpm library (see rpm5.org) rather than fork an rpm -q for each file.
1.2 uses agent_runstatus together with the data in the pkg table, rather than setting attribs on each file.
specagent is missing “Description”
Should we have separate debian and rpm package agents or one master agent? One master agent might be more efficient. One master agent also seems simpler for the case when a single upload contains both debian and rpm packages.
Should metadata keywords be munged to a common name? For example:
debian “Package” = RPM “Name”
debian “Section” = RPM “Group”
debian “Architecture” = RPM (have to get arch from filename)
debian “Depends” = RPM “Requires”
I think I'd lean toward keeping the raw data and not trying to combine the meanings of these keyword fields.
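One way to keep the raw data and still query across both formats is to leave the stored keywords untouched and translate only at lookup time. The sketch below illustrates that idea; the mapping is taken from the list above, while the names `DEBIAN_TO_RPM`, `rpm_arch_from_filename`, and `lookup` are hypothetical. The architecture parsing assumes the conventional `name-version-release.arch.rpm` file name layout.

```python
# Debian keyword -> RPM keyword, taken from the list above.
DEBIAN_TO_RPM = {"Package": "Name", "Section": "Group", "Depends": "Requires"}

def rpm_arch_from_filename(filename):
    """Extract the arch from a name-version-release.arch.rpm file name."""
    if not filename.endswith(".rpm"):
        return None
    stem = filename[:-len(".rpm")]
    _, dot, arch = stem.rpartition(".")
    return arch if dot else None

def lookup(raw_meta, pkg_format, debian_key, filename=None):
    """Read a field by its Debian keyword from raw, unmodified metadata."""
    if pkg_format == "deb":
        return raw_meta.get(debian_key)
    if debian_key == "Architecture":
        # RPM metadata has no matching keyword here; fall back to the file name.
        return rpm_arch_from_filename(filename) if filename else None
    return raw_meta.get(DEBIAN_TO_RPM.get(debian_key, debian_key))

# Usage with made-up metadata.
rpm_meta = {"Name": "bash", "Group": "System Environment/Shells", "Requires": "glibc"}
deb_meta = {"Package": "bash", "Section": "shells", "Depends": "libc6"}
print(lookup(rpm_meta, "rpm", "Package"))
print(lookup(deb_meta, "deb", "Package"))
print(lookup(rpm_meta, "rpm", "Architecture", "bash-4.2.46-34.el7.x86_64.rpm"))
```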