DataMelt

DataMelt (or, in short, DMelt) is a free computation and visualization environment: an interactive framework for scientific computation, data analysis and data visualization designed for scientists, engineers and students. DataMelt is multiplatform because it is written in Java, so it runs on any operating system where the Java virtual machine can be installed.

The program is designed for statistical data analysis, curve fitting, data-mining algorithms, numeric computations and interactive scientific plotting in 2D and 3D. DataMelt is scripted in high-level programming languages such as Jython, Groovy and JRuby, but its numerical and graphical libraries can also be called directly from Java code.

DataMelt is an attempt to create a data-analysis environment from open-source packages, with a coherent user interface and tools that are competitive with commercial programs. The idea behind the project is to incorporate open-source mathematical and numerical software packages with GUI-type user interfaces into a coherent program in which the main user interface is based on short-named Java/Python classes. Short class names were needed to build an analysis environment around the Java scripting concept. A typical example is shown below.

Scripts and Java code (in the case of Java programming) can be run either in the GUI editor of DataMelt or as batch jobs. The graphical libraries of DataMelt can be used to create Java applets. All charts (or "canvases") used for data representation can be embedded into a web browser.

DataMelt can be used wherever analysis of large numerical data volumes, data mining, statistical data analysis and mathematics are essential. The program can be used in the natural sciences, engineering, and the modeling and analysis of financial markets. While the program falls into the category of open-source software, it is not completely free for commercial usage (see below), no source code is available on the home page, and all documentation and even bug reporting require "membership".

Overview
DataMelt has several features for data analysis:


 * uses Jython, BeanShell, Groovy or JRuby scripting, or standard Java; a GNU Octave mode is also available for symbolic calculations;
 * can be integrated with the Web in the form of applets or Java Web Start applications, so it is suited to distributed analysis environments over the Internet;
 * has a full-featured IDE with syntax highlighting, a syntax checker, code completion and a code analyser, and includes a version of the IDE for small-screen devices;
 * includes a help system with code completion based on Java reflection;
 * allows data to be written in C++ and analyzed using Java/Jython, and has a browser for serialized objects and objects created using Google Protocol Buffers;
 * includes SQL-based and NoSQL databases;
 * includes packages for statistical calculations, error (uncertainty) propagation using a linear expansion or a Monte Carlo approach for an arbitrary function, and symbolic calculations similar to those found in the GNU Octave project or MATLAB, but rewritten in Java.
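The two error-propagation approaches named in the last item can be illustrated generically. The following is a minimal sketch in plain Python syntax, not DataMelt's own API: the helper names `linear_sigma` and `monte_carlo_sigma`, the test function and the input values are all hypothetical, and the inputs are assumed to be uncorrelated and Gaussian.

```python
import math
import random

def linear_sigma(f, means, sigmas, h=1e-6):
    """First-order (linear expansion) propagation for uncorrelated inputs:
    sigma_f^2 = sum_i (df/dx_i * sigma_i)^2, with numerical derivatives."""
    var = 0.0
    for i, s in enumerate(sigmas):
        up = list(means)
        dn = list(means)
        up[i] += h
        dn[i] -= h
        deriv = (f(*up) - f(*dn)) / (2.0 * h)  # central difference
        var += (deriv * s) ** 2
    return math.sqrt(var)

def monte_carlo_sigma(f, means, sigmas, n=20000, seed=1):
    """Monte Carlo propagation: sample Gaussian inputs, take the sample
    standard deviation of the function values."""
    rng = random.Random(seed)
    vals = []
    for _ in range(n):
        args = [rng.gauss(m, s) for m, s in zip(means, sigmas)]
        vals.append(f(*args))
    mean = sum(vals) / n
    return math.sqrt(sum((v - mean) ** 2 for v in vals) / (n - 1))

def area(x, y):
    """Hypothetical 'arbitrary function' of two measured quantities."""
    return x * y

means = [10.0, 5.0]    # illustrative central values
sigmas = [0.2, 0.1]    # illustrative uncertainties

lin = linear_sigma(area, means, sigmas)
mc = monte_carlo_sigma(area, means, sigmas)
print("linear: %.4f  Monte Carlo: %.4f" % (lin, mc))
```

For a nearly linear function with small uncertainties the two estimates agree closely; the Monte Carlo approach becomes the more reliable one when the function is strongly non-linear over the range of the inputs.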

DataMelt has its roots in particle physics, where data mining is a primary task. It was created as the jHepWork project in 2005 and was initially written for data analysis in particle physics using the Java software concept for the International Linear Collider project developed at SLAC. The string "HEP" in the project name "jHepWork" abbreviates "High-Energy Physics". Later versions of jHepWork were modified for general public use (for scientists, engineers and students, for educational purposes) since the International Linear Collider project had stalled.

In 2013, jHepWork became a general-purpose community-supported project and, owing to its wide popularity outside this area of physics, was renamed SCaVis (Scientific Computation and Visualization Environment). That project existed for 3 years before it was renamed DataMelt (or, in short, DMelt).

The main reference is the book "Scientific Data Analysis using Jython Scripting and Java", which discusses data-analysis methods using Java and Jython scripting. The program was later also discussed in the German journal Java SPEKTRUM.

DataMelt is hosted by the jWork.ORG portal.

Supported platforms
DataMelt runs on the Windows, Linux, Mac and Android operating systems. The package for Android is called AWork.

Documentation
DataMelt is extensively documented. In 2018, the web page of the project contained about 600 examples written in Jython, Java, Groovy and JRuby, covering a number of fields from general mathematics to data mining and data visualization. The Java API documentation describes more than 40,000 Java classes. In addition, there is wiki documentation. Access to the documentation is partly restricted for the general public because of the proprietary nature of the documentation project.

License terms
The core source code of the DataMelt numerical and graphical libraries is licensed under the GNU General Public License. The interactive development environment (IDE) used by DataMelt has some restrictions on commercial usage, since the language files, documentation files, examples, installer, code-assist databases and interactive help are licensed under a Creative Commons license. Full members of the DataMelt project have several benefits, such as a license for commercial usage, access to the source repository, an extended help system, a user script repository and access to the complete documentation.

The commercial licenses cannot apply to source code that was imported into DataMelt from, or contributed to it by, other authors.

Jython scripts
Here is an example of how to show 2D bar graphs by reading a CSV file downloaded from the World Bank web site.

The execution of this script plots a bar chart in a separate window. The image can be saved in a number of formats.
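The data-preparation step of such a script can be sketched in plain Python syntax, which also runs under Jython. This is an illustration only, not the project's own listing: it assumes a World Bank-style layout (a few metadata rows, then a header row starting with "Country Name" followed by one column per year), the helper `load_indicator` and the sample rows are hypothetical, and the plotting call to DataMelt's chart classes is deliberately left out.

```python
import csv

def load_indicator(lines, year):
    """Return {country name: value} for one year column, skipping blanks."""
    values = {}
    header = None
    col = None
    for row in csv.reader(lines):
        if header is None:
            # Skip metadata rows until the header row is found.
            if row and row[0] == "Country Name":
                header = row
                col = header.index(year)
            continue
        if len(row) <= col or row[col].strip() == "":
            continue  # no value for this country in this year
        values[row[0]] = float(row[col])
    return values

# Hypothetical sample rows (placeholder numbers, not real statistics).
sample = [
    '"Data Source","Example"',
    '',
    '"Country Name","Country Code","2011","2012"',
    '"Country A","AAA","100.5","110.25"',
    '"Country B","BBB","","95.0"',
    '"Country C","CCC","50.0",""',
]

gdp = load_indicator(sample, "2012")
for name in sorted(gdp):
    print("%s: %.2f" % (name, gdp[name]))
```

With a real file, `lines` would be the open file object; the resulting dictionary holds one bar value per country, ready to be passed to a chart.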

Here is another simple example, which illustrates how to fill a 2D histogram and display it on a canvas. The script also creates a figure in the Portable Document Format (PDF). This script illustrates how to glue and mix native Java classes (from the package java.util) and DataMelt classes (from the package jhplot) inside a script written using the Python syntax.
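A minimal sketch of such a script follows. The class and method names from jhplot (`H2D`, `HPlot3D`, `fill`, `visible`, `draw`, `export`) are written from memory of DataMelt's published examples and should be treated as assumptions to check against the API documentation; the bin settings, the number of entries and the output file name are hypothetical. The imports are guarded so that the file reports a message instead of failing when it is started outside a JVM.

```python
# java.util and jhplot can only be imported under Jython with the
# DataMelt jars on the classpath, so the imports are guarded.
try:
    from java.util import Random       # native Java class
    from jhplot import H2D, HPlot3D    # DataMelt classes (assumed API)
    HAVE_DMELT = True
except ImportError:
    HAVE_DMELT = False

NBINS, LO, HI = 25, -3.0, 3.0   # bins per axis and axis range (illustrative)
NEVENTS = 200                   # number of (x, y) pairs to fill

def make_histogram():
    """Fill a 2D histogram with pairs of Gaussian random numbers."""
    h2 = H2D("2D histogram", NBINS, LO, HI, NBINS, LO, HI)
    rand = Random()
    for _ in range(NEVENTS):
        h2.fill(rand.nextGaussian(), rand.nextGaussian())
    return h2

if HAVE_DMELT:
    canvas = HPlot3D("Canvas")           # interactive 3D canvas
    canvas.visible()                     # show the canvas window
    canvas.draw(make_histogram())        # display the filled histogram
    canvas.export("histogram2d.pdf")     # hypothetical output file name
else:
    print("DataMelt classes not found; run under Jython with DataMelt jars")
```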

This script can be run either in the DataMelt IDE or with a stand-alone Jython after adding the DataMelt libraries to the classpath.

Groovy scripts
The same example can also be coded in the Groovy programming language, which is supported by DataMelt.

Reviews
DataMelt and its earlier versions, SCaVis (2013-2015) and jHepWork (2005-2013), which are still available from the DataMelt archive repository, have been described in several articles, and the program has been compared with other similar frameworks.

DataMelt (2015-) is a new development of the jHepWork and SCaVis programs. It has been compared with other popular packages for statistical and numeric analysis. According to more recent surveys of online articles and blogs on data science, DataMelt was among the popular data-analysis packages prior to 2019.

Popularity
jHepWork and SCaVis/DataMelt are part of the software libraries of the National Institutes of Health Library, the Mathematical Support of the Institute for Nuclear Research of the Russian Academy of Sciences, and others. Commercially, DataMelt is provided as a service on Amazon EC2 clouds by the IT solution provider Miri Infotech.