This publication contains tabular summaries of the data management survey carried out at the Helmholtz Centre Potsdam GFZ German Research Centre for Geosciences, as well as the diagrams of individual questions shown in Radosavljevic et al. (2019). The online survey was conducted from August 27 to September 27, 2019. The survey design leaned on similar surveys carried out at German universities and research institutions (e.g. Paul-Stüve et al., 2015; Simukovic et al., 2013). The survey queried aspects of the complete data life cycle, from the planning stage to reuse, in 37 questions: 16 single-response (SR) questions, where only one answer was possible; 20 multiple-response (MR) questions, where multiple answers could be selected; and one free-text question. Research staff at all career levels were the target audience for the survey. Invitations to participate in the completely anonymous online survey were sent out over the general GFZ mailing lists. The survey was carried out with the Questback EFS Survey platform. Out of 411 attempts, 226 led to completed questionnaires, corresponding to a 55% completion rate. Compared to the target audience at GFZ, the participation rate amounted to ca. 24%. However, less than 20% of employees classified as infrastructure support staff, or as bachelor's and master's students and student assistants, completed the survey. Replies falling into these categories were grouped into "others" in the report as well as in the data presented here. Data summaries are given in two tab-separated tables containing, respectively, response counts and percentages for each question, grouped by department, role, and length of employment. Questions 5 and 34 were ranking questions; the corresponding entries in the percentages table represent arithmetic means of the replies for these questions, not percentages. The response counts for these questions are presented in the "Counts" table. Free-text replies are omitted from these results.
In addition, the diagrams of the individual questions presented in Radosavljevic et al. (2019) are provided in PNG and PDF formats.
The success of scientific projects increasingly depends on using data analysis tools and data in distributed IT infrastructures. Scientists need to use appropriate data analysis tools and data, extract patterns from data using appropriate computational resources, and interpret the extracted patterns. Data analysis tools and data often reside on different machines, because the volume of the data frequently demands specific resources for storage and processing, while the analysis tools require specific computational resources and run-time environments. The data analytics software framework DASF, developed at the GFZ German Research Centre for Geosciences (https://www.gfz-potsdam.de) and funded by the Initiative and Networking Fund of the Helmholtz Association through the Digital Earth project (https://www.digitalearth-hgf.de/), supports scientists in conducting data analysis in such distributed IT infrastructures by sharing data analysis tools and data. For this purpose, DASF defines a remote procedure call (RPC) messaging protocol that uses a central message broker instance. Scientists can augment their tools and data with this protocol to share them with others. Because the protocol is implemented on top of WebSockets, DASF supports many programming languages and platforms. It provides two ready-to-use language bindings for the messaging protocol, one for Python and one for TypeScript. To share a Python method or class, users add an annotation in front of it and specify the connection parameters of the message broker. The central message broker approach allows both the shared method and the client calling it to actively establish a connection, which enables using methods deployed behind firewalls. DASF uses Apache Pulsar (https://pulsar.apache.org/) as its underlying message broker.
The TypeScript bindings are primarily used in conjunction with web frontend components, which are also included in the DASF-Web library. They are designed to attach directly to the data returned by the exposed RPC methods, which supports the development of highly exploratory data analysis tools. DASF also provides a progress reporting API that enables users to monitor long-running remote procedure calls. One application using the framework is the Digital Earth Flood Event Explorer (https://git.geomar.de/digital-earth/flood-event-explorer), which integrates several exploratory data analysis tools and remote procedures deployed at various Helmholtz centres across Germany.
The increasingly high number of big data applications in seismology has made quality control tools to filter, discard, or rank data extremely important. In this framework, machine learning algorithms, already established in several seismological applications, are good candidates to perform the task flexibly and efficiently. sdaas (seismic data/metadata amplitude anomaly score) is a Python library and command line tool for detecting a wide range of amplitude anomalies in any seismic waveform segment, such as recording artifacts (e.g., anomalous noise, peaks, gaps, spikes), sensor problems (e.g., digitizer noise), and metadata field errors (e.g., a wrong stage gain in StationXML). The underlying machine learning model, based on the isolation forest algorithm, has been trained and tested on a broad variety of seismic waveforms of different lengths, from local to teleseismic earthquakes to noise recordings, from both broadband sensors and accelerometers. The software therefore assures a high degree of flexibility and ease of use: for any given input (a waveform in miniSEED format and its metadata as StationXML, given either as file paths or as FDSN URLs), the computed anomaly score is a probability-like numeric value in [0, 1] indicating the degree of belief that the analyzed waveform represents an anomaly (or outlier), where scores ≤ 0.5 indicate no distinct anomaly. sdaas can be employed to filter out malformed data in a pre-processing routine or to assign robustness weights, and it can be used as a metadata checker by scoring randomly selected segments from a given station/channel: in this case, a persistent sequence of high scores clearly indicates problems in the metadata.
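As a minimal illustration of using such scores in a pre-processing routine, the snippet below filters segments by the ≤ 0.5 threshold mentioned above. The segment names and score values are made-up placeholders; in practice, the scores would be computed by sdaas from the miniSEED and StationXML inputs.

```python
# Hedged sketch: keep only segments whose anomaly score does not indicate a
# distinct anomaly. Scores here are invented placeholders, not sdaas output.
SCORE_THRESHOLD = 0.5  # scores <= 0.5 indicate no distinct anomaly

segment_scores = {
    "NET.STA..HHZ/seg1": 0.42,  # looks fine
    "NET.STA..HHZ/seg2": 0.81,  # likely anomalous (e.g., recording artifact)
    "NET.STA..HHZ/seg3": 0.48,  # looks fine
}

kept = [name for name, score in segment_scores.items()
        if score <= SCORE_THRESHOLD]
```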
DASF: Progress API is part of the Data Analytics Software Framework (DASF, https://git.geomar.de/digital-earth/dasf), developed at the GFZ German Research Centre for Geosciences (https://www.gfz-potsdam.de). It is funded by the Initiative and Networking Fund of the Helmholtz Association through the Digital Earth project (https://www.digitalearth-hgf.de/). DASF: Progress API provides a lightweight tree-based structure to be sent via the DASF RPC messaging protocol. Its generic design supports deterministic as well as non-deterministic progress reports. While DASF: Messaging Python provides the necessary implementation to distribute the progress reports from the reporting backend modules, DASF: Web includes ready-to-use components to visualize the reported progress.
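A tree-based progress report of this kind can be sketched as follows; the class and field names are invented for illustration and do not reflect the actual DASF: Progress API. Deterministic nodes carry a known total of steps, while non-deterministic ones (`total=None`) merely signal that work is ongoing.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative sketch of a tree-based progress report; names are invented.
@dataclass
class ProgressNode:
    label: str
    done: int = 0
    total: Optional[int] = None  # None marks a non-deterministic step
    children: List["ProgressNode"] = field(default_factory=list)

    def fraction(self) -> Optional[float]:
        """Aggregate completion over this node and its direct children,
        counting only deterministic steps; None if there are none."""
        nodes = [self] + self.children
        deterministic = [(n.done, n.total) for n in nodes if n.total]
        if not deterministic:
            return None
        done = sum(d for d, _ in deterministic)
        total = sum(t for _, t in deterministic)
        return done / total

# A long-running remote call might report one subtree per processing stage.
report = ProgressNode("analysis", children=[
    ProgressNode("download", done=5, total=10),  # deterministic: 5 of 10 files
    ProgressNode("clustering", total=None),      # non-deterministic: running
])
```

Such a structure serializes naturally to JSON for transport over the messaging protocol, and a frontend can render deterministic nodes as progress bars and non-deterministic ones as spinners.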
The presented datasets and scripts were obtained for testing the performance of a trigger algorithm for use in combination with the ring-shear tester 'RST-01.pc'. Glass beads (fused quartz microbeads, 300-400 µm diameter) and Thai rice are sheared at varying velocity, stiffness, and normal load. The data are provided as preprocessed MAT-files ('*.mat') to be opened with Matlab R2015a and later. Several scripts are provided to reproduce the figures found in Rudolf et al. (submitted). A detailed list of files, together with the respective software needed to view and execute them, is available in 'List_of_Files_Rudolf-et-al-2018.pdf' (also available in MS Excel format). More information on the datasets and a short documentation of the scripts are given in 'Explanations_Rudolf-et-al-2018.pdf'. The complete data publication, including all descriptions, datasets, and evaluation scripts, is available as 'Dataset_Rudolf-et-al-2018.zip'.
This version of Shakyground (V.1.0) comprises several Python 3 scripts and returns the median values of spatially distributed ground motion fields for a selected area and a given synthetic earthquake rupture. These values are simulated by means of a set of GMPEs (Ground Motion Prediction Equations) developed by several experts for specific tectonic regions. The outputs can be provided in community standard formats (.xml). A simple IPython notebook to visualise these results is also included.
This version of Quakeledger (V.1.0) is a Python 3 program that can also be used as a WPS (Web Processing Service). It returns the available earthquake events contained within a given local database (a so-called catalogue) that must be customised beforehand (e.g. with historical, expert, and/or stochastic events). It is a rewrite of https://github.com/GFZ-Centre-for-Early-Warning/quakeledger and https://github.com/bpross-52n/quakeledger. In those original codes, an earthquake catalogue had to be provided in .CSV format; the main difference is that this version has been refactored and uses an SQLite database. The parser code can be found in “quakeledger/assistance/import_csv_in_sqlite.py”.
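The move from a CSV catalogue to an SQLite database can be illustrated with the standard library alone. The column names and schema below are invented for this sketch; the actual import logic lives in “quakeledger/assistance/import_csv_in_sqlite.py”.

```python
import csv
import io
import sqlite3

# Hypothetical two-event catalogue; a real catalogue CSV would be read from disk.
catalogue_csv = io.StringIO(
    "event_id,latitude,longitude,depth_km,magnitude\n"
    "ev1,-33.45,-70.66,30.0,6.1\n"
    "ev2,-36.10,-72.90,25.0,7.4\n"
)

# Create an in-memory SQLite database and load the catalogue into it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_id TEXT PRIMARY KEY, latitude REAL,"
    " longitude REAL, depth_km REAL, magnitude REAL)"
)
rows = [
    (r["event_id"], float(r["latitude"]), float(r["longitude"]),
     float(r["depth_km"]), float(r["magnitude"]))
    for r in csv.DictReader(catalogue_csv)
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?)", rows)

# Once in SQLite, event selection becomes a query instead of a file scan.
strong = [row[0] for row in conn.execute(
    "SELECT event_id FROM events WHERE magnitude >= 7.0")]
```

Storing the catalogue in SQLite lets the service filter events by magnitude, location, or depth with indexed queries rather than re-parsing the CSV on every request.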
DASF: Messaging Python is part of the Data Analytics Software Framework (DASF, https://git.geomar.de/digital-earth/dasf), developed at the GFZ German Research Centre for Geosciences. It is funded by the Initiative and Networking Fund of the Helmholtz Association through the Digital Earth project (https://www.digitalearth-hgf.de/). DASF: Messaging Python is an RPC (remote procedure call) wrapper library for the Python programming language. As part of the data analytics software framework DASF, it implements the DASF RPC messaging protocol. This message-broker-based RPC implementation supports the integration of algorithms and methods implemented in Python into a distributed environment. It utilizes pydantic (https://pydantic-docs.helpmanual.io/) for data and model validation using Python type annotations. Currently, the implementation relies on Apache Pulsar (https://pulsar.apache.org/) as the central message broker instance.
The dataset presented herein originates from the JAGUARS (Japanese German Underground Acoustic Emission Research in South Africa) project, which took place from 2007 to 2009 in the Mponeng Gold Mine, South Africa. Project partners included Ritsumeikan University, the Earthquake Research Institute of the University of Tokyo, and Tohoku University in Japan; the GFZ German Research Centre for Geosciences Potsdam and the Gesellschaft für Materialprüfung und Geophysik (GMuG) mbH in Germany; as well as the Council for Scientific and Industrial Research in Johannesburg, Seismogen CC in Carletonville, AngloGold Ashanti Ltd, and the Institute of Mine Seismology in the Republic of South Africa. This publication forms part of the Geo-INQUIRE initiative (HORIZON-INFRA-2021-SERV-01 call, project number 101058518). It is cross-referenced on the EPISODES Platform (https://episodesplatform.eu/?lang=en#episode:JAGUARS (not yet existing)), which is managed by the EPOS TCS AH (European Plate Observing System Thematic Core Service Anthropogenic Hazards). Within the EPISODES Platform, the datasets are consolidated into an “episode” titled “JAGUARS: Mining induced picoseismicity associated with gold mining”. The EPISODES Platform offers open access to the integrated research infrastructures of the EPOS TCS AH, enabling users to download data and utilize a range of basic online visualization tools to graphically represent and process the datasets directly within their personal workspace.
| Organisation | Count |
|---|---|
| Science | 10 |
| Type | Count |
|---|---|
| unknown | 10 |
| License | Count |
|---|---|
| Open | 10 |
| Language | Count |
|---|---|
| English | 10 |
| Resource type | Count |
|---|---|
| None | 10 |
| Topic | Count |
|---|---|
| Soil | 5 |
| Organisms and habitats | 4 |
| Air | 1 |
| Humans and the environment | 10 |
| Other | 10 |