An Introduction to the World of Metadata and Models
An Introduction to Metadata for Computer Models
General Model Information
Model Validation
Metadata and Cataloging: Methods & Ideas
Research Papers / Articles / Conference Reports on Metadata
 


An Introduction to Metadata for Computer Models

In 1989, Science magazine ran an article titled "Is It Real, or Is It Cray?" (Pool, 1989), introducing a whole new field of science referred to as computer experimentation. At that time, only a handful of laboratories across the globe had "supercomputers" large enough and/or powerful enough to run computer models.

Now, nearly 12 years later, computer models can be transferred over the web or "burned" onto a plastic disk and downloaded onto a palm-held computer. As technology continues to advance, I'm sure that in the near future we'll look back at today's greatest technology as if it were archaic.

This sudden boom in technology has been paralleled by a sudden influx of computer models into the scientific community. Models are being used for research and understanding of everything from hydrology to yarn manufacturing, from gold deposits to survival rates in ICUs. With this sudden influx comes a bit of confusion.

The problem is that, to date, there has been no standard method for one person to communicate with another about the model that they have, and with this breakdown in communication comes a breakdown in the ease of sharing knowledge and experience. For this reason, a computer model metadata standard has been needed.

The driving force behind this effort to develop a computer model metadata standard is the increasing number of digital libraries, registries, and clearinghouses, and the need (and desire) to be able to catalog computer models in these sources. It is through these sources that the knowledge and experience gained in model technology can be shared and distributed.

The effort to create a model metadata standard is taking place in the academic arena. The academic community has a vested interest in computer models: not only are models used in both instruction and research, but it is also through that research that many models are developed. The academic circle will be able to develop standards that are useful for academia, yet applicable to and accepted by those in both government and industry.

What is Metadata?

Quite simply put, metadata is data about data (Clarke, 1999). Metadata consists of the descriptors of a particular data set or object. I like to use the idea of a painting to describe metadata.

Let's take, for example, a painting done by Picasso: "Portrait of E.M. Walter (Mémé)" (fig. 1). One of the first things we can tell about this painting is that it was painted by Picasso. (The Play-Doh head gives that one away.) What else can be said about this painting?

Well, with a little research we can find out that it was painted on October 21st, 1939. We are also able to find out the title. So we have three "elements" of the painting's "metadata." What else would someone care about?

A picture framer or a gallery would want to know the size: it's 41 cm x 33 cm. An artist might be interested in how it was done: it was done in pencil and oil on canvas. An art critic or art history student might even be interested in the fact that it reflects the feelings that Picasso had toward the subject. (By the way, they say that it reflects the love that he had for her.) All of these elements describe the painting. If I saw the painting in a gallery or museum, I might be able to deduce much of this on my own, but what if I needed to catalog this painting without ever seeing it? That is where the painting's metadata becomes helpful, if not necessary.
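
To make this concrete, here is a minimal sketch, in Python, of how those descriptors might be recorded as a simple key/value metadata record. The element names and values are only illustrative; they do not follow any formal cataloging standard:

    # The painting's descriptors recorded as a simple key/value record.
    # Element names here are invented for illustration only.
    painting_metadata = {
        "creator": "Pablo Picasso",
        "title": "Portrait of E.M. Walter (Meme)",
        "date_created": "1939-10-21",
        "dimensions": "41 cm x 33 cm",
        "medium": "pencil and oil on canvas",
        "notes": "Said to reflect the artist's feelings toward the subject",
    }

    # A cataloger (or a search system) can now work with the record
    # without ever seeing the painting itself.
    for element, value in painting_metadata.items():
        print(f"{element}: {value}")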

So you want to catalog fashion models?

Well, not exactly. Let me explain what type of "model" we're referring to.

A model basically consists of three steps: the input of information, the processing of that information, and a result from the process.

Our minds run models all the time. For example, let's think about toast. If I set the toaster to 1 or 2, the toast pops back out in three seconds and still looks a lot like bread. To resolve the problem, I set the toaster to 8 or 9. As you can imagine, just before the toast pops up, the smoke alarm goes off, because what was once a piece of bread now looks like a moon rock. The information about the toaster settings went into my mind, the data was processed, and I came up with a result: set the toaster to a mid-range value.

High school dating is all about mind models. I remember asking a girl out several weekends in a row. The constant excuses that she gave for not being able to go out that weekend were entered into the computer in my mind, the data was processed, and the result was that she really didn't want to go out with me. Even my 11-month-old son has begun working with models. He has learned that if he gets close to the lamp, we will say no, and that predictable response, or result, is something that he can laugh at.

So what is a computer model?

Instead of these mental processes, a computer model runs on algorithms and differential equations. The data is entered in a variety of forms, a specific process is run on the data, and a result is given out. These computer-based models have found applications in all sorts of fields and studies. Computer models, once referred to as "computer experiments" (Pool, 1989), have a great range of advantages.
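
Before turning to those advantages, here is a minimal sketch, in Python, of the three steps every model shares: input, process, and output. The "toaster rule" below is invented purely to echo the example above; it is not a real model:

    def toaster_model(setting: int) -> str:
        """Input: a toaster dial setting (1-9).
        Process: a simple made-up rule.
        Output: a predicted result."""
        if setting <= 2:
            return "still looks a lot like bread"
        if setting >= 8:
            return "looks like a moon rock"
        return "golden-brown toast"

    # Feed every input through the process and inspect the output.
    for dial in range(1, 10):
        print(dial, "->", toaster_model(dial))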

First of all, aside from the cost of the computer, computer models are much less expensive than many actual experiments. If we wanted to test a chemical or element that was quite expensive, and we understood its properties, we could develop a computer experiment that would show how the substance reacted. This test could be run over and over again, changing the variables ever so slightly, and the results compared with each other. We could discover something new that would never have been found because of the cost of traditional experiments.

Another advantage of computer models is that they are easier to handle. Let's say that the military wanted to test a big bomb that they have recently come up with. They have created a few prototypes, but the citizens of Nevada are fed up and won't let the military test their bombs there anymore. Nevada being the last place on earth that anyone would try to save, the military is forced into creating a computer model of how this bomb would react. They've been working with bombs for years, and know what will happen with different chemicals and wind patterns. The military creates the model, it is found to be effective, and they get the funding to build 2,000 within the next year. Although the bomb was dangerous, they were able to test it safely.

Just as in the example of the expensive chemical experiment, we can also change the parameters of the experiment in slight increments and assess the effect. Many times in actual experiments, we are unable to change and monitor the parameters. This ability to change things around is another advantage of a computer model.
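
As a sketch of this "change the variables ever so slightly" idea, the following Python fragment sweeps one parameter of a stand-in model and compares the outcomes. The yield function is made up for illustration; it stands in for an experiment that would be expensive to repeat in a lab:

    def reaction_yield(temperature_c: float) -> float:
        """A made-up stand-in for an expensive laboratory experiment."""
        return max(0.0, 1.0 - abs(temperature_c - 72.5) / 100.0)

    # Re-run the "experiment" at many slightly different settings,
    # something that would be costly with a real chemical.
    results = {t: reaction_yield(t) for t in range(60, 86)}
    best = max(results, key=results.get)
    print(f"Best temperature tried: {best} C (yield {results[best]:.2f})")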

One of the final advantages of a computer model is that sometimes it's the only way. Let's look at the "global warming" issue. In an attempt to remain politically neutral, let me just explain how these predictions have been formed. The earth is a huge test tube. It would be easy enough for us to throw more CO2 into the atmosphere, or slow down the ocean circulation patterns, but this might have some nasty effects that would make our lifestyle a bit more difficult, if not impossible. For this reason, computer models of the earth have been created. With this type of model, one can change the amounts of CO2 or CFCs and see what the model results will be. If it weren't for these models, one could only guess about what the future might hold.

So what do Picasso and computer models have in common?

Although I can think of some computer programs that would change normal portraits into something that looks like a Picasso, the two have very little in common. The thing that ties them together is the need to describe something to someone who doesn't know anything about it. Hence, the need for computer model metadata.

Being based on one's understanding of a natural system, a computer model is basically a collection or compilation of what we know about that system. Through sharing this knowledge, someone else can take the next step in developing or applying the understanding of the system.

So the question arises: how do we share these models? Technology has brought us to a point where one can simply look online and download a model onto a laptop computer. There are model registries where one can find a large collection of models pertaining to a particular subject. You can go to books and other publications that are full of model descriptions. But none of these sources use the same "elements" to describe the models.

This causes a problem. Let's say that I wanted to use a model to study the runoff of water from catchment basins. I could go to an online registry and find a number of models that pertain to the subject. I then look in a publication that describes different models, and again I find several catchment basin models. I then have a friend send me what he knows about a model that he heard a friend of his was using. I now have a collection of information about six different models. One source just gives me the cost and the name; another gives the input data required and the format of the output data; while another gives me a personal account of the advantages and drawbacks of a particular model. How can I compare and contrast these models? How can I find the one that is best for what I need? If I had a standardized method for describing these models, I could see the differences between them. This standardized list of "elements" is what I have created in a Computer Model Metadata Standard, or a "set of rules" for describing a computer model.
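
To illustrate why a shared element list matters, here is a small Python sketch in which two model descriptions use the same fields and so can be compared side by side. The element names and model entries are invented examples, not the actual elements of the standard:

    ELEMENTS = ["name", "subject", "cost", "input_data", "output_format"]

    models = [
        {"name": "BasinSim", "subject": "catchment runoff", "cost": "free",
         "input_data": "rainfall series", "output_format": "CSV"},
        {"name": "HydroCalc", "subject": "catchment runoff", "cost": "$500",
         "input_data": "rainfall and soil maps"},  # output format not reported
    ]

    # Print a simple comparison table, one column per model.
    for element in ELEMENTS:
        row = [m.get(element, "(not reported)") for m in models]
        print(f"{element:>14}: " + " | ".join(row))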

Creating a standard

In forming a standard, I basically went through four steps: seeing how others have described models, finding out what a potential computer model user wants to know, making some decisions based on my findings, and modifying those decisions based on feedback.

How have others described models?

This part of my research included two processes. The first was to interview a number of people on campus who either had created models or had models that they wanted to share. I asked them to describe their models to me and took note of the aspects of the models that they brought up. The second was to find resources for models on the Internet. I found two different types of sites that had data about models. The first type was organizations or companies that had models that they were using, giving away, or selling. These sources would typically have a small number of models, or one specific model, and their descriptions would usually be in a textual format, similar to how one person would describe a model to another. The other source of Internet model metadata was model registries. These groups would usually have a special interest, and their metadata would usually consist of an element list where one would enter a brief description of the model. In the following figure (fig. 2), I have included a spreadsheet of the different elements provided, as well as the organizations that produced the metadata.

What does a potential user want to know about a model?

This phase of the standard's creation primarily consisted of interviewing professors and researchers who use models. I asked them what they look for when seeking out models. What elements are important to them? What helps them decide which model is best suited for their purpose? From their answers, I was able to better decide which of the elements I found in the first step were most important.

As well as speaking to potential model users, I also spoke with a few model creators. These sources were able to let me know what types of metadata had been collected about their models, which helped me further assess which elements are useful for describing models.

Making some decisions.

In one of my interviews, the person suggested that I write a standard. A standard is basically a set of rules for recording information; in this case, the rules for recording information about a computer model. This would be a good way to organize the ideas and thoughts that I was collecting. It would also be a step in the direction of producing a unified method of describing models.

A few challenges faced me at this step. First and foremost was the fact that I hadn't ever worked with, let alone written, a standard. I needed to research other standards in order to understand how such a set of rules is written. I looked at two different standards that have been written for the collection of geospatial data.

These two standards presented guidelines for writing the rules of metadata. One of the things that I found was how to state whether an element is required or optional. I also began to understand methods for organizing the different elements into compound elements (sections of related elements).
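
A rough Python sketch of those two ideas, required-versus-optional elements grouped into compound elements, might look like the following. The section and element names are hypothetical, not the ones in my standard:

    # A hypothetical schema: compound elements (sections) whose member
    # elements are each marked required or optional.
    SCHEMA = {
        "Identification": {
            "title": "required",
            "creator": "required",
            "date": "optional",
        },
        "Technical": {
            "input_data": "required",
            "platform": "optional",
        },
    }

    def check_record(record):
        """Return a list of any required elements the record is missing."""
        problems = []
        for section, elements in SCHEMA.items():
            for element, status in elements.items():
                if status == "required" and element not in record.get(section, {}):
                    problems.append(f"missing required element: {section}/{element}")
        return problems

    print(check_record({"Identification": {"title": "BasinSim"}}))
    # -> ['missing required element: Identification/creator',
    #     'missing required element: Technical/input_data']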

After several days of punching away at the standard, I came up with a finished product. This led me to the next step.

Seeking out feedback

Here at UC Santa Barbara, as part of the Library, there is a Map and Imagery Library. This library has begun to go digital, making its information available over the Internet. This digital collection is referred to as the Alexandria Digital Library. Within the group of people implementing the ADL, there are several who are interested in incorporating computer models into the library collection. This group consists of computer scientists, librarians, and geographers. They need a standard of some kind to catalog the models that they hope to collect, so they have a strong interest in the structure of the metadata standard. I sent a copy of the standard that I put together to this group, asking for feedback and thoughts. This was only several weeks ago.

Making modifications according to feedback

I was surprised that I didn't get feedback from more people in the group, but I did hear back from two people. One of them was Linda Hill.

Linda Hill received her education in information organization and has worked with standards for quite a while. She guided me through a good number of decisions about changes to make. For example, the numbering system that I had originally used to organize the compound elements was confusing, having never been used in a standard. She also gave me direction on making some elements required and others optional.

I took the suggestions to heart and created a second draft of the standard. To date, this is the most current version of the Metadata Standard for Computer Models.

The Future of the Content Standard for Computer Model Metadata

I am currently working on the third draft of the standard. The standard will soon be growing in circulation. Dr. Goodchild will be taking it to a conference on geographic models, and Linda Hill will be showing it to people at NASA who are interested in the same question of model collection. I am sure that this increased circulation will bring feedback from several more sources, and that several more revisions will be needed.

The next steps will include implementation of the standard. I hope to organize a number of models using this standard and begin creating a database for models. With this, I will also need to organize a search engine or some other organized method for retrieving the models.
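
Even a very simple retrieval method becomes possible once the models are cataloged with a common element list. As a sketch (in Python, with invented catalog entries):

    catalog = [
        {"title": "BasinSim", "subject": "catchment runoff hydrology"},
        {"title": "YarnFlow", "subject": "yarn manufacturing simulation"},
    ]

    def search(query):
        """Return every cataloged record whose elements mention the query."""
        q = query.lower()
        return [record for record in catalog
                if any(q in str(value).lower() for value in record.values())]

    print(search("runoff"))  # finds BasinSim, the hydrology model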

The long-term goal for the standard is to have it accepted as "the" way to describe models. This is a long process, and it would be difficult for any single person or school to have something accepted on a global scale. If I can implement the standard and show its utility, that will make the standard more acceptable.

Conclusion

Computer models don't just allow us to share information; they allow us to share our understanding of a natural system. If we don't have a standardized method for this transfer, we will be unable to share these thoughts and ideas about the natural system. Without a flow of knowledge, science cannot progress. It is through sharing the knowledge contained within models that we can allow science to progress. For this reason, it is essential that a standardized method is developed and implemented, so that knowledge and understanding can be transferred from one person to another.
 


General Model Information
http://helios.bto.ed.ac.uk/ierm/gcte3/links.htm
Ecological Metamodelling Links. A page full of links under the following headings relating to modeling platforms: graphical modeling environments; modeling languages; systems that offer some element of modularity in model construction; specialized modeling systems; and programming support for modeling (e.g., Fortran subroutine libraries, object-oriented modeling systems).
There are also sections on model databases, simulation and modeling organizations, other web resources on modeling, and documents relevant to metamodelling.

Developing an Easier Method for Metadata Collection
    One issue that I have heard repeatedly in my research on metadata is the cost of developing it.  I feel that there must be an easier way, and I have described one below in a PowerPoint presentation.  After I described my idea to a computer scientist, they summarized it as follows: "Providing a Web interface to populate a digital library with object metadata." (download the PowerPoint presentation)
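
As a rough illustration of that idea, here is a minimal sketch of a web interface that accepts metadata elements from a form and adds them to a collection. It uses the Python library Flask as a modern stand-in for the CGI scripts of the day; the route, field names, and storage are all hypothetical:

    from flask import Flask, request

    app = Flask(__name__)
    library = []  # stands in for the digital library's real storage

    @app.route("/submit", methods=["POST"])
    def submit_metadata():
        # Collect the element values the contributor typed into the form.
        record = {
            "title": request.form.get("title", ""),
            "creator": request.form.get("creator", ""),
            "subject": request.form.get("subject", ""),
        }
        library.append(record)
        return {"status": "accepted", "records": len(library)}

    if __name__ == "__main__":
        app.run(port=8080)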


Model Validation
In model documentation, the issue of reporting the "fitness for use" or "validation" of a computer model is of concern.  Below is a collection of articles and other resources related to the topic.

http://www.first.gmd.de/applications/short/wmz.html
Validation of a Photochemical Model for Eastern Austria.  A short paper describing the validation of a photochemical model used to predict ozone over eastern Austria.

http://dao.gsfc.nasa.gov/DAO_people/dee/publications.html
A large list of articles and reports dealing with Model Validation

http://www.dmu.dk/AtmosphericEnvironment/harmoni/m_v_kit.htm
A "model Validation kit" which includes a number of datasets which can be run on a particular model to help evaluate the model.

http://www.cpm.mmu.ac.uk/cpmrep25.html
Validation and Verification of Computational Models with Multiple Cognitive Agents  By: Scott Moss, Bruce Edmonds and Steve Wallis  Date: July 1997

http://www.nerc-essc.ac.uk/Science/Science95_97/10Floods.html
An article describing techniques for validating flood models

Dee, Dick, A pragmatic approach to model validation, Coastal and Estuarine Studies, 47 (1995). Chapter 1 in: Quantitative Skill Assessment for Coastal Ocean Models, D. R. Lynch and A. M. Davies, Eds., American Geophysical Union, 510 pp.

Chapra, Steven C., Confirmation of water quality models, Ecological Modelling, 20 (1983) 113-133

Landry, Maurice, et al., Model validation in operations research, European Journal of Operational Research, 14 (1983) 207-220

Loehle, Craig, Evaluation of theories and calculation tools in ecology, Ecological Modelling, 19 (1983) 239-247

Maley, C.C., Models in evolutionary ecology and the validation problem

Dee, Dick, Guidelines for documenting the validity of computational modeling software, IAHR Report, June 1994

Dee, D. P., and M. J. van der Marel, Validation of computer models: Concepts and terminology.  Delft Hydraulics Report X84, 1991,  Delft Hydraulics, Delft

Dee, D. P., A framework for the validation of generic computational models. Delft Hydraulics Report X109, 1993,  Delft Hydraulics, Delft.

Hamilton, Martin A., Model validation: An annotated bibliography, Commun. Statist. Theory Meth., 20 (7), (1991), 2207-2266

Costanza, Robert and Sklar, Fred H., Articulation, accuracy, and effectiveness of mathematical models: A review of freshwater wetland applications, Ecological Modelling, 27 (1985) 45-68

Addiscott, Tom, et al., Critical evaluation of models and their parameters, Journal of Environmental Quality, 24 (1995) 803-807

Power, M., The predictive validation of ecological and environmental models, Ecological Modelling, 69 (1993) 33-50

van Deursen, W.P.A., Geographical Information Systems and Dynamic Models, Utrecht, 1995

Oreskes, N., Shrader-Frechette, K., Belitz, K. (1994) Verification, Validation, and Confirmation of Numerical Models in the Earth Sciences, Science 263, 4 Feb. 1994, 641-645

I have also written a short paper discussing a number of methods for assessing model validity:  A Question of Quality: Practical Approaches to Describing Model Quality in Cataloging & Model Metadata Collection
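
For a flavor of what the validation-kit approach above looks like in practice, here is a minimal Python sketch: run the model on a reference dataset and summarize the disagreement between predictions and observations with a statistic such as the root-mean-square error. The numbers are invented for illustration:

    import math

    observed = [1.0, 2.0, 3.0, 4.0]    # measurements from the field
    predicted = [1.1, 1.9, 3.2, 4.1]   # what the model produced

    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                     / len(observed))
    print(f"RMSE = {rmse:.3f}")  # smaller means a closer fit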


Metadata and Cataloging: Methods & Ideas
http://www.ala.org/alcts/organization/ccs/ccda/tf-meta1.html
Association for Library Collections and Technical Services,  Task Force on Metadata.  Includes a list of nearly 20 people on this task force with Mary Larsgaard as the Chair.

http://purl.org/dc/
The home page for the Dublin Core Metadata Initiative.  A cataloging tool for internet resources.

       http://www.lub.lu.se/cgi-bin/nmdc.pl
       A "Dublin Core Metadata Template".

http://maphost.dfg.ca.gov/wetlands/metadata/wet_met.htm
An example of a web page set up to deliver geospatial metadata.  This example is the Northern California Wetlands and Riparian GIS data put out by the Dept. of Fish and Game.

http://iee.umces.edu/~villa/IMA/
"A proposal for a unified modeling architecture, a conceptual framework encompassing many (ideally most) generalizations used nowadays in modeling the natural world. We hope that this effort will eventually allow us to construct truly general, multi-paradigm system models in a unified, collaborative development environment."

http://www.ifla.org/II/metadata.htm
The IFLANET "Digital Libraries:  Metadata Resources"  A large list of links dealing with metadata for digital libraries.

http://spsosun.gsfc.nasa.gov/Meta_slide2.html
A nice chart that describes different compound elements of metadata and the purpose behind them.

http://oaspub.epa.gov/edr/EPASTD$.STARTUP
An introductory page to a few data standards used for metadata collection for EPA data sets.
 


Research Papers / Articles / Conference Reports on Metadata
http://www.xml.com/xml/pub/98/06/rdf.html
In explaining the purpose of their product (RDF, a type of computer-based cataloging system), the authors give a very basic explanation of, and use for, metadata.

http://www.lic.wisc.edu/metadata/metaprim.htm
A paper put out by the National States Geographic Information Council titled Metadata Primer -- A "How To" Guide on Metadata.  "This primer is designed to provide a practical overview of the issues associated with developing and maintaining metadata for digital spatial data. It is targeted toward an audience of state, local, and tribal government personnel. The document provides a "cook book" approach to the creation of metadata."

http://rockyweb.cr.usgs.gov/nmpstds/metastds.html
A collection of nine documents (in .pdf format) describing USGS Geospatial Metadata Standards.

http://www.dlib.org/dlib/july96/new/07smith.html#SECTION00020000000000000000
The Meta-Information Environment of Digital Libraries, Terence R. Smith, Director, Alexandria Digital Library Project
University of California at Santa Barbara, Santa Barbara, CA 93106
 D-Lib Magazine, July/August 1996

http://fgdc.er.usgs.gov/
The Federal Geographic Data Committee's (FGDC) home page.  They are the coordinators of the National Spatial Data infrastructure, and their page has a number of links to papers addressing metadata.

http://www.fgdc.gov/metadata/contstan.html
Content Standard for Digital Geospatial Metadata (CSDGM).  The metadata standard set forth by the FGDC.  It is general to all types of geospatial information and does not address geographic models.
http://fgdc.er.usgs.gov/metadata/metadata.html
A brief description of what metadata is.
http://www.gis.state.mn.us/stds/metadata.htm
The Minnesota Geographic Metadata Guidelines.  Again, general in the sense of "geographic data".

http://csdl.tamu.edu/DL95/papers/kacmar/kacmar.html
Automatic Creation and Maintenance of an Organizational Spatial Metadata and Document Digital Library  "The focus of this paper is a digital library system that is a collection of software components that are designed to support all aspects of [geo]spatial metadata and document collection and management for the eleven member agencies in Florida's GMDNCC."

http://geology.usgs.gov/tools/metadata/tools/doc/ctc/
A nice tutorial from the USGS on the use of the FGDC's standard metadata for geospatial data.

http://www.library.ucsb.edu/people/larsgaard/mulinher.html
Multilevel Description, Multilevel Inheritance, Relations/Links: Content and Carrier
Mary Larsgaard

http://computer.org/proceedings/meta/1999/papers/55/jfrew.htm
Generic Query Metadata for Geospatial Digital Libraries, James Frew, Michael Freeston, Linda Hill, Greg Janée, Mary Larsgaard, Qi Zheng

http://www.mathematik.uni-osnabrueck.de/projects/workshop97/papers/larsgard7.12.html
Metadata applied to Digitized Images in a Web Environment, Mary Larsgaard

http://kmi.open.ac.uk/projects/scholonto/
Scholarly Ontologies Project.  "The ScholOnto Project is concerned with techniques and technologies for scholarly publishing and discourse, which focus on the representation and analysis of the conceptual knowledge networks in which research documents are embedded. The infrastructure we are developing to support research communities builds on the OCML ontological modeling language (Motta), the WebOnto graphical environment for editing and sharing OCML ontologies (Domingue), and the LispWeb CommonLisp HTTP server (Ramoni)."   links to a number of papers relating to knowledge-based digital libraries.

http://www.dlese.org/Metadata/conference/index.htm
A workshop agenda for DLESE (Digital Library for Earth System Education) that will "examine the challenge of describing digital resources (that is attaching metadata) so that community members can find desired educational resources."  There are a number of links to other resources.
           www.delese.org/GUI
            One can download a few of their PowerPoint presentations here.

http://www.MDCinfo.com/index.html
The Metadata Coalition.  "The Coalition allies software vendors and users with a common purpose of driving forward the definition, implementation and ongoing evolution of a metadata interchange format standard and its support mechanisms."