Includes the functionality of all of the following:
STATISTICA Multivariate Exploratory Techniques offers a broad selection of exploratory techniques, from cluster analysis to advanced classification tree methods, with a comprehensive array of interactive visualization tools for exploring relationships and patterns; it also includes built-in, complete Visual Basic scripting.
STATISTICA Advanced Linear/Nonlinear Models contains a wide array of the most advanced linear and nonlinear modeling tools on the market, supports continuous and categorical predictors, interactions, hierarchical models; automatic model selection facilities; also, includes variance components, time series, and many other methods; all analyses include extensive, interactive graphical support and built-in complete Visual Basic scripting.
STATISTICA Power Analysis and Interval Estimation is an extremely precise and user-friendly research tool for analyzing all aspects of statistical power and sample size calculation.
Includes the functionality of all of the following:
STATISTICA Quality Control Charts offers versatile presentation-quality charts with a selection of automation options, customizable features, and user-interface shortcuts to simplify routine work.
STATISTICA Process Analysis is a comprehensive package for process capability, Gage R&R, and other quality control/improvement applications.
STATISTICA Design of Experiments features the largest selection of DOE, visualization and other analytic techniques including powerful desirability profilers and extensive residual statistics.
STATISTICA Power Analysis and Interval Estimation is an extremely precise and user-friendly research tool for analyzing all aspects of statistical power and sample size calculation.
STATISTICA Automated Neural Networks contains a comprehensive array of statistics, charting options, network architectures, and training algorithms; C and PMML (Predictive Model Markup Language) code generators. The C code generator is an add-on.
Fully integrated with the STATISTICA system.
Includes the functionality of all of the following:
STATISTICA Automated Neural Networks
STATISTICA Data Miner contains the most comprehensive selection of data mining solutions on the market, with an icon-based, extremely easy-to-use user interface. It features a selection of completely integrated, and automated, ready to deploy "as is" (but also easily customizable) specific data mining solutions for a wide variety of business applications. The product is offered optionally with deployment and on-site training services. The data mining solutions are driven by powerful procedures from five modules, which can also be used interactively and/or used to build, test, and deploy new solutions.
A large portion of analytic functionality used by STATISTICA Data Miner is driven by the computational engines of modules that are included in various other STATISTICA products:
However, several modules include selections of highly specialized data mining and data mining modeling techniques that are offered only as part of STATISTICA Data Miner. The following are these modules:
STATISTICA Text Miner is an optional extension of STATISTICA Data Miner. The program features a large selection of text retrieval, pre-processing, and analytic and interpretive mining procedures for unstructured text data (including Web pages), with numerous options for converting text into numeric information (for mapping, clustering, predictive data mining, etc.) and language-specific stemming algorithms. Because of STATISTICA's flexible data import options, the methods available in STATISTICA Text Miner can also be useful for processing other unstructured input (e.g., image files imported as data matrices).
The program contains numerous options for accessing text documents in different formats, including .txt (text), .pdf (Adobe), .ps (PostScript), .html, .xml (Web-formats), and most Microsoft Office formats (e.g., .doc, .rtf).
Flexible user interface options (and automation functions) are provided for selecting large numbers of files via wild-cards (e.g., to select all documents in a particular subdirectory structure).
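The wildcard selection described above can be pictured with a short Python sketch. This is only an illustration of recursive pattern matching over a directory tree, not STATISTICA's own interface; the function name `select_documents` and the sample directory layout are hypothetical.

```python
import tempfile
from pathlib import Path

def select_documents(root, pattern="*.txt"):
    """Return every file under `root` (recursively) whose name matches `pattern`."""
    return sorted(p for p in Path(root).rglob(pattern) if p.is_file())

# Build a small throwaway directory tree to demonstrate the selection.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "reports" / "2020").mkdir(parents=True)
    (base / "notes.txt").write_text("top-level note")
    (base / "reports" / "summary.txt").write_text("summary")
    (base / "reports" / "2020" / "q1.txt").write_text("first quarter")
    (base / "reports" / "2020" / "q1.pdf").write_text("not selected")
    # Paths relative to the root, for every .txt file in the whole tree.
    selected = [p.relative_to(base).as_posix() for p in select_documents(base)]
```

Here a single pattern picks up the text files at every level of the subdirectory structure while leaving the .pdf file out.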
The program supports full "Web-crawling" capabilities, so that documents can be extracted from the Web, starting at a particular root Web page (URL). All documents linked to that particular page will be included, as well as the documents linked to those sub-documents, and so on, up to a user-specified level or depth.
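As a rough illustration of crawling to a user-specified depth, the following Python sketch performs a breadth-first traversal over a link graph. It is a minimal sketch under stated assumptions, not the product's implementation: the `crawl` function is hypothetical, and the `LINKS` dictionary stands in for actually fetching and parsing Web pages.

```python
from collections import deque

def crawl(root, links, max_depth):
    """Breadth-first traversal from `root`, following links up to `max_depth` hops.

    `links` maps each URL to the URLs it links to (a stand-in for fetching
    and parsing a real page).
    """
    seen = {root: 0}          # URL -> depth at which it was first reached
    queue = deque([root])
    while queue:
        url = queue.popleft()
        depth = seen[url]
        if depth == max_depth:
            continue          # do not follow links beyond the requested depth
        for target in links.get(url, []):
            if target not in seen:
                seen[target] = depth + 1
                queue.append(target)
    return seen

# Hypothetical link graph used only for illustration.
LINKS = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/c"],
    "http://example.com/b": ["http://example.com/"],
    "http://example.com/c": ["http://example.com/d"],
}

pages = crawl("http://example.com/", LINKS, max_depth=2)
```

With a depth of 2, the root page, the pages it links to, and the pages those link to are collected; anything further away is left out, and pages already visited are not processed twice.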
File names and URLs can also be stored in text variables in STATISTICA data files. In this manner, the program can not only process actual text stored in text variables, but also properly interpret references to text documents or URLs. Thus, numeric information and textual information (large documents) can be stored on a per-case (observation) basis, and meaningful analyses can be performed on data files where, for each observation, numeric as well as (voluminous) unstructured textual information is available (e.g., patients' age, height, and weight, along with physicians' narrative descriptions of symptoms).
Options are provided to flexibly import such lists of filenames or URLs into the columns of a STATISTICA spreadsheet.
Documents can be preprocessed before (or, in practice, concurrently with) the indexing of all documents. Exclusion rules and stub-lists can be applied to remove common but uninformative words such as "a", "the", "to", and "is". A stemming algorithm is then applied so that English words such as "traveled" and "traveling" both count as instances of "travel".
Next, the program will index the "stubbed-and-stemmed" documents, to create a frequency count of all words and for all documents. This "raw-data" (count) information is the basis for all subsequent numerical analyses.
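The preprocessing and indexing steps can be sketched in a few lines of Python. This is an illustrative approximation only: the exclusion list contains just the four example words from the text, and `toy_stem` is a deliberately crude, hypothetical suffix stripper standing in for a real language-specific stemming algorithm.

```python
import re
from collections import Counter

STOP_WORDS = {"a", "the", "to", "is"}   # the example words from the text

def toy_stem(word):
    """Strip a few English suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_document(text):
    """Return word-stem frequencies for one document after applying the exclusion list."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(toy_stem(t) for t in tokens if t not in STOP_WORDS)

# Hypothetical documents used only for illustration.
docs = {
    "doc1": "The patient traveled to the clinic.",
    "doc2": "Traveling is a risk; the patient is traveling.",
}
counts = {name: index_document(text) for name, text in docs.items()}
```

In this sketch, "traveled" and "traveling" are both counted under "travel", so `counts["doc2"]["travel"]` is 2, and the excluded words never reach the frequency table.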
Before creating a STATISTICA Data File containing the counts (etc.) to summarize the documents, various additional filters may be applied. For example, the counts for particular (most frequent) words per document can be:
The resulting data file with numeric information (e.g., SVD dimensions, raw counts, relative counts, most-frequent-word counts, and so on) is then ready for further analyses.
Various options are provided for writing the information extracted from text into the input data file, or directly into external databases (see also the description of STATISTICA In-Place Database Processing technology).
All statistical analysis methods can be applied to the numeric summaries representing the texts. Simple summary statistics may extract the most common words used in the documents.
By mapping the documents into the SVD dimensions (e.g., via PCA), dimensional maps of documents can be created, to evaluate the similarity of documents, etc.
By mapping documents into dimensions based on original (transformed) word counts, simultaneous maps of documents and words can be created. This reflects the "meaning" of documents.
Clustering techniques (such as EM or k-Means) can be applied to identify clusters of similar documents.
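To make the clustering step concrete, here is a small, self-contained k-Means sketch in Python applied to word-count vectors. It is a plain textbook version of Lloyd's algorithm with a simplistic initialization (the first k rows), not the STATISTICA implementation; the count matrix and its column meanings are hypothetical.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical counts per document for the stems
# ("travel", "fever", "invoice", "payment").
word_counts = [
    [3, 2, 0, 0],
    [0, 0, 4, 3],
    [2, 3, 0, 1],
    [0, 1, 3, 4],
]
labels, centroids = kmeans(word_counts, k=2)
```

The first and third documents end up in one cluster and the second and fourth in the other, reflecting their shared vocabulary.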
Predictive data mining techniques can be used to relate the numerical summaries of documents to other indicators of interest, e.g., fraudulent intent, medical diagnosis, and so on.
Key analytic components requiring extensive data processing are implemented via multi-threaded computing technology, to extract optimum performance from advanced multiple-processor server hardware.
WebSTATISTICA is offered as a complete solution that includes the analytic functionality of the respective selected STATISTICA product or any combination of STATISTICA products.
One of the clearest advantages offered by the WebSTATISTICA technology is that it makes the power of any of the STATISTICA family of products conveniently available anywhere from any workstation equipped with an industry-standard Web browser. Thus, WebSTATISTICA adds a new dimension and an endless array of new possibilities and applications to the entire line of STATISTICA Data Analysis, Data Mining, Quality Control, and Six Sigma software.
WebSTATISTICA supports multiprocessor environments and works with load-balanced environments, making it suitable for internal cloud computing environments.
WebSTATISTICA supports one or more customized Web-based analytic applications to suit an organization's specific needs. Users log in and see a highly targeted user interface customized for the particular application needs. Users have single-click access to the desired set of queries, analysis results, and reports, all displayed within their Web browser.
The full power of STATISTICA analytics is available via the server-based, Wide Area Network (WAN) architecture, providing all of the advantages of no client software to install, central configuration and ongoing management, increased scalability and performance, and a highly interactive user experience.
For example, the most recent data and reports (e.g., updated via queries to the specific parts of the corporate data warehouse) - with options to interactively drill down into the results and interactively obtain additional, specific insights about the business - can now be made available to authorized employees wherever they are and regardless of the type of computers to which they have access. Wherever there is the Internet (which means virtually everywhere), there is now also access to the query, reporting, and analytic tools of the most comprehensive data analysis system available.
WebSTATISTICA Server acts as the core of an enterprise-wide network system, allowing the participants to work collaboratively and quickly share results (reports) as well as scripts of analyses or queries. Administrators can use user or group permissions to manage the access of specific groups of users to specific data or reports. The accessibility of its tools via the Internet makes WebSTATISTICA Server a perfect system to facilitate collaborative projects of employees working at different locations or branches of a corporation (even on different continents), or employees who are telecommuting or traveling.
WebSTATISTICA Knowledge Portal is a powerful, Web-based, knowledge-sharing tool that allows your colleagues, employees, and/or customers (with appropriate permissions) to log in and quickly and efficiently get access to the information they need, by reviewing predefined reports.
WebSTATISTICA Interactive Knowledge Portal offers portal visitors all the functionality of the Knowledge Portal plus additional options. These options include allowing the user to define and request new reports, run queries and custom analyses, drill down and up, slice/dice data, and gain insight from all resources that are made available to them by the portal designers or administrators.
STATISTICA Enterprise Web Viewer provides the ability to view analyses and reports that were generated within STATISTICA Enterprise or STATISTICA Enterprise / QC. This allows companies to protect their data and reports with the STATISTICA Enterprise security model.