Dataflows

Mapping out how data flows from place to place

Introduction

Dataflow diagrams show how data gets copied from system to system over time. The different places where data can be stored are laid out from top to bottom, and the sequence of events the data can go through is plotted from left to right. The lines of the 'tube map' or 'metro' layout show how the data moves, and you can inspect the event 'stations' to get more information about each event.

Using OAIS As An Example

The dataflow diagram below shows a simplified version of how the Open Archival Information System (OAIS) describes the flow of data through an archive:

Dataflows are defined using a text format that describes the sequence of events:

start sip@producer
move sip@producer sip@store "Ingest SIP"
transform sip@store aip@store "SIP to AIP"
space
derive aip@store dip@store "Generate DIP"@N [0,1]
move dip@store dip@consumer "Access"
end
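
To make the structure of this format more concrete, here is a minimal Python sketch that reads the example above into a list of events and then lists the storage places in the order they first appear. It is not the project's own parser. The tokenising rules, the Event class and the function names parse_line and places_in_order are all assumptions made for illustration, and tokens whose meaning is not described here (such as @N and [0,1]) are kept verbatim rather than interpreted.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed token shapes: a double-quoted label (optionally followed by an
# @suffix such as @N), a bracketed group such as [0,1], or a bare word.
TOKEN = re.compile(r'"[^"]*"(?:@\S+)?|\[[^\]]*\]|\S+')


@dataclass
class Event:
    """One line of the dataflow text (hypothetical structure)."""
    command: str                                                # e.g. "move"
    refs: List[Tuple[str, str]] = field(default_factory=list)   # (item, place)
    label: Optional[str] = None                                 # quoted text
    extras: List[str] = field(default_factory=list)             # uninterpreted


def parse_line(line: str) -> Optional[Event]:
    """Split one line into a command, item@place references and a label."""
    tokens = TOKEN.findall(line.strip())
    if not tokens:
        return None
    event = Event(command=tokens[0])
    for tok in tokens[1:]:
        if tok.startswith('"'):
            # Separate the quoted label from any trailing suffix like @N.
            text, _, suffix = tok[1:].partition('"')
            event.label = text
            if suffix:
                event.extras.append(suffix)
        elif tok.startswith('['):
            event.extras.append(tok)
        elif '@' in tok:
            item, place = tok.split('@', 1)
            event.refs.append((item, place))
        else:
            event.extras.append(tok)
    return event


def places_in_order(events: List[Event]) -> List[str]:
    """Return each place once, in order of first appearance."""
    seen: List[str] = []
    for ev in events:
        for _item, place in ev.refs:
            if place not in seen:
                seen.append(place)
    return seen


SOURCE = '''start sip@producer
move sip@producer sip@store "Ingest SIP"
transform sip@store aip@store "SIP to AIP"
space
derive aip@store dip@store "Generate DIP"@N [0,1]
move dip@store dip@consumer "Access"
end'''

events = [ev for ev in map(parse_line, SOURCE.splitlines()) if ev is not None]
print(places_in_order(events))  # ['producer', 'store', 'consumer']
```

In this sketch the list of places corresponds to the rows of the diagram and the list of events to its left-to-right sequence. How the real tool orders the rows is not stated here, so first-appearance order is only a guess.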

The text box below shows the full source for the diagram above, which includes more definitions, details and comments. If you edit it, the diagram above will automatically update, so you can see what happens.

OAIS In Detail

TBA: Dataflow diagrams can also be used to explore the OAIS model in more detail.

Real Dataflows

A number of well-established and long-running digital preservation services have been studied as part of the Registries of Good Practice project. There is a wide range of successful approaches and strategies, and here we use dataflow diagrams to document what real-life digital preservation looks like.

Dataflow Definition Language

TBA: A more detailed definition of the Dataflow Definition Language (DFL), for reference purposes.