FOSSology Backup and Restore Scope

===== Implement and document backup and restore =====

Fully implement and document how to back up and restore a running FOSSology system, including the database, the repository, and any necessary system configuration specific to FOSSology.

We will provide two solutions for backing up and restoring the repository:

  - Back up and restore the entire repository
  - Back up and restore only the gold files in the repository

==== Backup and restore entire repository solution ====

Provide the user with instructions for backing up and restoring the entire repository.

=== Backup solution ===

  1. Stop the Scheduler before the backup (and verify that all the agents have stopped)
  2. Back up the PostgreSQL database
    * Implement a backup script: //pg_dumpall | gzip > backup_filename//, run from a cron job
    * Copy the //backup_filename// files daily to a backup server using rsync (or other tools)
  3. Back up the entire repository, including the gold, files, and license directories, to a backup server using rsync
  4. Start the Scheduler after all backups have finished

=== Restore solution ===

  1. Restore the PostgreSQL database
  2. Restore the entire repository
  3. Restart the Scheduler

**Notes:** We recommend this solution to users who have enough disk space. The instructions should also list approximate backup and restore times so that users can weigh the tradeoff.

==== Backup and restore only Gold files solution ====

We also provide a solution that backs up and restores only the gold files and the database. This is a good solution for users who don't have enough disk space or don't want to back up the entire repository.

=== Backup solution ===

  1. Stop the Scheduler before the backup (and verify that all the agents have stopped)
  2. Back up the PostgreSQL database
    * Implement a backup script: //pg_dumpall | gzip > backup_filename//, run from a cron job
    * Copy the //backup_filename// files daily to a backup server using rsync (or other tools)
  3. Back up only the gold and license directories of the repository
  4. Start the Scheduler after all backups have finished

=== Restore solution ===

  1. Restore the PostgreSQL database
  2. Restore only the gold and license directories of the repository; do not unpack during the restore process
  3. Restart the Scheduler
  4. Provide a user interface for users who want to re-unpack the gold files

=== Code changes to implement only backup gold files solution ===

  * Unpack agent code changes: the agent requires a switch so that it can unpack to a repository without updating the db.
  * UI code changes for browsing files that are not in the repository: any place where a file is retrieved from the repository needs to check that the file exists. If it does not, check that the gold file exists. If it does, possibly ask the user whether they want to recover from the gold file, and if so, queue up a job to do the ununpack.
  * If a gold-files-only backup is taken while a license analysis job (or any other agent that needs the unpacked files) is in process, then after a restore those agents will not find the unpacked files. All such agents need code changes to handle this case: queue up a job to do the ununpack, or some other solution (needs discussion).

==== Reduce the size of the repository ====

Decide which unpacked files should be kept in the repository and which should not, and remove those that should not be kept. Implement this as part of the backup scope.

==== Backup and restore the necessary system configuration ====

Open question: Which configuration files need to be backed up?

The backup procedure does not cover the system configuration files; only add notes about them to the backup and restore procedures document. The FOSSology files that need to be backed up are Scheduler.conf, Host.conf, Db.conf, RepPath.conf

**Question**: What are the detailed backup requirements for the configuration files, and when should they be backed up (at the same time as the database and repository backup)?

Answer bobg Aug-3-09: If these files are lost, the user should recover them from their normal system backup. If they are lost due to a system failure, they have bigger problems than restoring fossology.

===== Build any tools to support backup =====

Design, review, build, test, and document any tools, agents, or plugins that are necessary to enable the backup and restore process documented in #1

  * For example, it may be necessary to **create a tool that can validate the contents of a fossology repository compared with the contents of the metadata database**, or report where there are inconsistencies; this tool would be necessary, for example, if the backup strategy only backed up the "Gold" files instead of the entire unpacked repository
  * **Create a tool that can manually carry out the backup and restore process** to begin with. Long term, consider whether it should be automated when a disaster happens.
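
A manual backup tool of the kind described above could start very small. The following is a minimal Python sketch, not an agreed design: ''backup_fossology'' and ''run_shell'' are hypothetical names, the scheduler stop and start commands are site-specific and have to be passed in by the caller, and the sketch does not verify that all agents have stopped.

```python
import shlex
import subprocess
from typing import Callable


def run_shell(command: str) -> None:
    """Run one shell command and raise CalledProcessError if it fails."""
    subprocess.run(command, shell=True, check=True)


def backup_fossology(
    stop_cmd: str,
    start_cmd: str,
    dump_file: str,
    repo_dir: str,
    dest: str,
    run: Callable[[str], None] = run_shell,
) -> None:
    """Stop the scheduler, dump the database, copy the repository, restart.

    stop_cmd and start_cmd are assumptions supplied by the caller; this
    sketch does not know how the scheduler is controlled on a given site.
    """
    run(stop_cmd)
    try:
        # The pipeline's exit status is gzip's, so a pg_dumpall failure
        # can go unnoticed here; inspect the dump before trusting it.
        run(f"pg_dumpall | gzip > {shlex.quote(dump_file)}")
        # -a is rsync's archive mode: recursive copy that keeps
        # permissions and timestamps.
        run(f"rsync -a {shlex.quote(repo_dir)} {shlex.quote(dest)}")
    finally:
        # Restart the scheduler even when a backup step failed.
        run(start_cmd)
```

Passing a different ''run'' callable (for example one that only prints each command) gives a dry run without touching the system.
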

bobg Aug-3-09:

  1. stop scheduler
  2. pg_dumpall and save backup
  3. backup repository
  4. start scheduler
Does it need to be more complex than this?

===== Deploy backup strategy =====

Implement the proposed strategy of backing up and restoring only the gold files on the two running FOSSology production systems (external and internal).

**Question**: How are the production systems configured and deployed? We need to understand the production infrastructure better.

{{:task:fossology_deployment.jpg|}}

I drew a picture of the external FOSSology production system deployment; please review the diagram and add comments.

===== Test disaster recovery =====

  * Perform a disaster recovery test on one or both production installations to validate the safety of the production systems.
  * **disaster recovery**: The server is in the midst of a job and the plug is pulled; can we recover?
  * **Flat out lose everything** - the whole data center blows up, no more disks, no more nothing
  * blow away **just the database**
  * **lose an agent and its storage**. then what happens?

**Question:** What is the relationship between an agent and its storage? Is the storage on the agent's local disk or on a network file system? When an agent is lost, how should the FOSSology cluster react?

**Note:** This brings up some very important questions:

  - How do you do multi-system backups? It will be different from single system backups because each agent has unique data stored on its local storage that must be backed up.
  - If an agent machine fails, how does the running FOSSology system respond? (i.e., notify the user? Try to work around the failure?)
  - If you restore an agent machine that has failed, how can you verify that its local storage is consistent with the rest of the repository?

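
The last question above, whether restored storage agrees with what the rest of the system expects, comes down to comparing two sets of paths. The following is a minimal Python sketch under stated assumptions: the list of expected paths has already been obtained from the metadata database (that query is not shown and depends on the schema), paths are relative to the repository root, and ''compare_repository'' is a hypothetical name.

```python
import os
from typing import Iterable, Set, Tuple


def compare_repository(root: str, expected: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Compare files under root with the paths the database expects.

    Returns (missing, unexpected): paths expected but absent on disk,
    and paths present on disk that the database does not know about.
    """
    expected_set = set(expected)
    on_disk: Set[str] = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Store paths relative to root so they match the expected list.
            on_disk.add(os.path.relpath(full_path, root))
    missing = expected_set - on_disk
    unexpected = on_disk - expected_set
    return missing, unexpected
```

This only checks that files exist; it does not compare file contents, so a truncated or corrupted file would pass.
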
===== Backup and Restore multi-system repository =====

The usual multi-system deployment: distributed agents and a distributed repository.

  * Back up the agents' repositories: all agents' repositories are mounted as one whole repository, and the whole repository is backed up from this mount point
  * Restore the agents' repositories: just restore the repository, without considering for now which agent's local storage each part is restored to. (This should be considered in the future.)

===== Old notes =====

The following is an old conversation about backing up an internal machine.

  - Currently, there is a 4 day rotating archive of the db on rfo. Should it be longer?
  - Start "off-site" backups on rfo (similar to what is currently done with fossbazaar & fossology); the archives are stored locally on rfo. Advice is needed (from Matt?) on how to do proper backups.
  - Define & test the recovery process.
    * database recovery (Mary)

Additional notes from 11/13 meeting:

  * We are dumping the database on ffhc and rfo, but we need to verify that it's getting backed up off-system
  * We also need at a minimum to back up the gold files, and then we need to test the recovery process...
  * What's with the license/ directory in the repo? Just temporary storage, not critical to recovery
  * Tests
    - disaster recovery: The server is in the midst of a job (which jobs?) and the plug is pulled; can we recover?
    - Flat out lose everything - the whole data center blows up, no more disks, no more nothing
    - blow away just the database
    - lose an agent and its storage. then what happens?

//Notes on **document procedure to backup the fossology metadata** from 2008.05.6 IRC discussion//

BTW, do we have an easy way to let users back up their FOSSology database? or is that documented anywhere?

danger: no, I raised this issue 6+ months ago

danger: that's in the postgres docs

danger: since we need it ourselves, we're not doing backups of fossology yet

bobg: I know Postgres has a way to let you back things up, but it would probably be a good idea to summarize the fossology specifics

bobg: and somewhere down the road (a long ways) create a "Back up my FOSSology data" menu item

taggart: ack.

taggart: we now have a way to capture this :)

sorry for the delay

I had proposed having fossology automatically dump state to the filesystem, so that a normal filesystem backup could grab those snapshots

by that I mean the fossology postgresql db

I think the repo and golden area, etc should be fine with a normal filesystem backup

**Update 7/17 danger** Postgres backups are now in force on our repos. This is accomplished with a simple //pg_dumpall | gzip > backup_filename// put into a cron job on the repository system.

For gold files, in any large system there are a lot of gold files. It would be advantageous to only store the source URL for all gold files that have it set. Only backup the physical gold files for those that have no source URL. Note this has not been implemented yet. It would require querying the database to find out where gold files came from.
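
The source-URL idea above amounts to splitting the gold files into two groups. The following is a minimal Python sketch, assuming the database query (not shown, and not implemented) yields ''(path, source_url)'' pairs with ''None'' or an empty string where no URL is recorded; ''split_gold_files'' is a hypothetical name.

```python
from typing import Iterable, List, Optional, Tuple


def split_gold_files(
    records: Iterable[Tuple[str, Optional[str]]],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Split gold files into those to copy and those recorded by URL only.

    Returns (to_copy, url_only): paths that must be physically backed up,
    and (path, url) pairs for files that could be fetched again later.
    """
    to_copy: List[str] = []
    url_only: List[Tuple[str, str]] = []
    for path, url in records:
        if url and url.strip():
            url_only.append((path, url.strip()))
        else:
            # No usable source URL: the file itself has to be backed up.
            to_copy.append(path)
    return to_copy, url_only
```

Note that a recorded URL is only a substitute for the file as long as the upstream copy stays available and unchanged, which this sketch does not check.
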