Business Intelligence projects usually start as:
- Requests for reports and dashboards to visualize data stored in a production database
- Requests to access data from various databases and build global activity reports and KPI projects
- Projects to align numbers with processes, to set global rules for KPI calculation, to deliver legacy reports, etc.
By comparison, Data Science projects more often start as:
- Requests to understand why the data shows the results it does
- Requests to cross existing information with additional information, to add value to existing data
- Projects that try to build models to understand data, such as clustering, association, or decision trees
- Projects that try to build forecasting and predictive models
A Business Intelligence project will focus on:
- Data quality and data consistency, using ETL and Data Quality tools
- Redefining rules to aggregate data, standardize information, and clean data, using Master Data Management tools
- Loading data into the Data Warehouse (ODS, DWH and DTM parts), using ETL tools
Business Intelligence part
- Define reports, dashboards, KPIs and cubes with end users, and adjust the Data Mart structure to meet their expectations
- Create reports, dashboards, cubes and the related metadata to provide access to validated data
- Create visualizations that highlight trends
Global Project Integration
- Define workflows to process, for example, data loading + KPI calculation + report creation
By comparison, a Data Science project will focus on:
- Platform and components, such as a predictive language (R is recommended)
- External data analysis and integration: which external information influences my data?
- Analyzing data and building models to explain correlations within the data and the impact of changes to input data
- Building statistical, analytical and predictive models
- Providing tools for advanced users to access, visualize and manipulate data
Business Intelligence part
A traditional Data Science Project – Components
These terms are more or less similar, all aiming at the study of data. Business Analytics is used in contrast with Business Intelligence and is the common name in commercial projects and tenders, whereas Data Science is the name more often used in universities and research centers.
Machine Learning is an extension of Data Science. We see it as a learning model that can improve itself by analyzing its results and using adaptive parameters to reduce the difference between the “forecast” and the “real”.
For example, we built a model to forecast water consumption in convenience stores. The model can run on its own to compare its predictions with reality, and then adjust its forecast algorithm to better account for the differences between what it predicted and the actual daily consumption in stores.
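To make the idea of a forecast that corrects itself more concrete, here is a minimal sketch in Python. It is not the model described above: the function name `run_adaptive_forecast`, the `learning_rate` parameter and the sample consumption figures are all assumptions made for illustration. It only shows the simplest form of the loop: forecast, compare with reality, and move the next forecast toward what was actually observed.

```python
def run_adaptive_forecast(actuals, initial_forecast, learning_rate=0.5):
    """Return the one-step-ahead forecast made before each observed value.

    Hypothetical helper: after each observation, the forecast is shifted
    by a fraction (learning_rate) of the error it just made.
    """
    forecast = initial_forecast
    forecasts = []
    for actual in actuals:
        forecasts.append(forecast)
        error = actual - forecast                    # gap between the "real" and the "forecast"
        forecast = forecast + learning_rate * error  # adjust the next forecast toward reality
    return forecasts


# Made-up daily consumption: demand jumps from 100 to 120 and stays there.
daily_consumption = [100.0, 120.0, 120.0, 120.0]
forecasts = run_adaptive_forecast(daily_consumption, initial_forecast=100.0)

# Absolute error per day; it shrinks as the model adapts to the new level.
errors = [abs(a - f) for a, f in zip(daily_consumption, forecasts)]
print(forecasts)
print(errors)
```

With these sample values the error shrinks day after day once the consumption level changes, which is the behavior the paragraph describes. A real model would adapt more than a single parameter.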
By the way, it is funny to notice that in Business Intelligence we always had the same kind of naming discussion, for example to explain the difference between a Report and a Dashboard.
YES. Vanilla Air can run any R program and deploy any R package; this is one of the strengths of the platform. R is the de facto language for building and deploying analytics and predictive models, with hundreds of thousands of developers worldwide, which makes adopting Vanilla easy for anybody who wants to deploy their packages at enterprise level.
What is the advantage of using Vanilla Air to deploy my R packages?
Using Vanilla Air, you immediately take advantage of a cluster-ready platform that can run your R packages on a cluster of R services and is ready to scale with your growing requests for complex data analysis.
Does Vanilla Air contain preprocessing features?
YES. Vanilla Air can run a large number of preprocessing methods on any dataset, such as clustering, filtering, classification, new column calculation (for class allocation) and correlation between columns, making it easy for users to explore their dataset and manage its quality.
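As an illustration of three of the preprocessing steps named above (filtering, new column calculation for class allocation, and correlation between columns), here is a small sketch in plain Python. It does not use or describe the Vanilla Air interface; the `pearson` helper, the column names and the sample rows are assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up dataset: one row has a missing temperature.
rows = [
    {"temperature": 10, "sales": 20},
    {"temperature": 20, "sales": 40},
    {"temperature": 30, "sales": 60},
    {"temperature": None, "sales": 50},
]

# 1. Filter: keep only complete rows.
clean = [r for r in rows if r["temperature"] is not None]

# 2. New column calculation: allocate each row to a class.
for r in clean:
    r["sales_class"] = "high" if r["sales"] >= 40 else "low"

# 3. Correlation between two columns.
corr = pearson([r["temperature"] for r in clean],
               [r["sales"] for r in clean])
print(clean)
print(corr)
```

In this made-up sample, sales grow in exact proportion to temperature, so the coefficient comes out at 1. Real data would rarely be that clean.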
How can Vanilla Air help me analyze my own data?
Vanilla Air comes with a Dataset interface to connect to any kind of external data: data available in any kind of database, CSV/text files located on disk or HDFS, and also Vanilla Hub datasets. We make it easy to manipulate the data before you start building your analysis in R (filtering, data visualization, correlation …) by turning any dataset into an R data frame. With Vanilla Air, integrating a text file or database dataset into an R program is no longer a headache.
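To show what turning a text file into a data-frame-like structure involves, here is a minimal Python sketch that reads CSV text into named columns, the same column-oriented layout a data frame uses. It does not reproduce the Vanilla Air Dataset interface; `to_columns` and the sample data are hypothetical, and all values are kept as text for simplicity.

```python
import csv
import io


def to_columns(csv_text):
    """Parse CSV text into a dict of column name -> list of values (as strings)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(value)
    return columns


# Made-up sample file content.
csv_text = "store,liters\nA,12.5\nB,8.0\n"
columns = to_columns(csv_text)
print(columns)
```

A real integration would also have to handle type conversion, encodings and missing values, which is the kind of work the Dataset interface is meant to take off the user's hands.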
Can you tell me more about cubes inside Vanilla Air?
Vanilla Air provides an interface to manipulate a dataset and create a virtual cube that presents the dataset as a cube, with dimensions and measures, using Vanilla Analysis technologies. Cube analysis is an important part of data pre-analysis, as it lets developers create dimensions that, combined, help explain key measures.
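The idea of viewing a flat dataset as a cube can be sketched in a few lines of Python: pick one or more columns as dimensions, pick a numeric column as the measure, and aggregate. This is only an illustration of the concept, not the Vanilla Analysis engine; `cube_view`, the column names and the sample rows are assumptions.

```python
from collections import defaultdict


def cube_view(rows, dimensions, measure):
    """Sum a measure for every combination of the chosen dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# Made-up dataset: water consumption per store and month.
rows = [
    {"store": "A", "month": "Jan", "liters": 10.0},
    {"store": "A", "month": "Feb", "liters": 12.0},
    {"store": "B", "month": "Jan", "liters": 7.0},
    {"store": "B", "month": "Jan", "liters": 3.0},
]

by_store = cube_view(rows, ["store"], "liters")
by_store_month = cube_view(rows, ["store", "month"], "liters")
print(by_store)
print(by_store_month)
```

Changing the list of dimensions changes the level of detail of the view without touching the underlying dataset, which is what makes the cube "virtual".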
What kinds of processes can be scheduled?
Virtually any program can be scheduled, as Vanilla Air provides a Workflow interface to design and run complex processes that acquire data, run any R program, and save the result set or the program output to any database or text file. Any workflow can be scheduled, and can even be called from an external program through a Web Services call or the command-line interface.
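A workflow of this kind (acquire data, run a program, save the result) can be pictured as a chain of steps where each step receives the output of the previous one. The Python sketch below is a generic illustration under that assumption; it is not the Vanilla Air Workflow interface. All function names are hypothetical, an in-memory list stands in for the database or text file, and scheduling itself is not shown.

```python
def run_workflow(steps, payload=None):
    """Run each step in order, passing the output of one step to the next."""
    for step in steps:
        payload = step(payload)
    return payload


def acquire(_):
    # Stand-in for reading from a database or a file.
    return [3, 5, 8]


def analyze(values):
    # Stand-in for the analysis program.
    return {"count": len(values), "total": sum(values)}


saved = []  # stand-in for the target database or text file


def save(result):
    saved.append(result)
    return result


result = run_workflow([acquire, analyze, save])
print(result)
```

Because the workflow is just an ordered list of steps, the same `run_workflow` call could be triggered by a scheduler or by an external program.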
What services does Vanilla Hub provide?
Vanilla Hub provides connectors to integrate various kinds of data, including:
- Weather Data (Temperature, Humidity, Rain, Wind, actual data and forecast data),
- Financial data (gold price, oil price, exchange rate, stock exchange …)
- Social data (Facebook, Twitter, Pinterest, YouTube …)
- Any data from any Website, using crawling technology
- Connectors to standard ERP and CRM platforms
- Connectors to platforms like Google Analytics, Google Drive, Dropbox, Nagios
- Custom API integration to acquire customer data (database, CSV or XML)
Using Vanilla Hub, you can connect to virtually any kind of database and schedule retrieval processes.
How does Vanilla Hub work together with Vanilla ETL?
Vanilla Hub can send any dataset to Vanilla ETL, so that Vanilla ETL can run complex transformations and take advantage of Vanilla Architect, our Master Data Management platform, to apply conversion rules to the data. Vanilla Hub is more of an EAI/ETL platform, focusing on data acquisition and data storage on Hadoop; it does not overlap with Vanilla ETL features.
What kinds of BI services are available with Smart Data?
Vanilla BI services (OLAP, dashboards, KPIs, maps and even reports) are available everywhere inside Vanilla Smart Data. We didn’t reinvent the wheel: Vanilla BI, the de facto standard in Open Source Business Intelligence, has it all! Integration with the designer interfaces is tight:
- Inside Agila, users can design their own cubes, generate dashboards, and publish those dashboards on a Vanilla instance
- Inside Vanilla Air, users can design cubes from the dataset interface
- Vanilla Air programs can be used to load KPI values that are later displayed on maps
Why do you embed Vanilla ETL as a component of Vanilla Smart Data?
When it comes to acquiring complex data in real time, we believe there is a need to separate the data acquisition process, which is part of the Vanilla Hub feature set, from the data transformation process, which is handled by an ETL platform such as Vanilla ETL.
Vanilla Hub, the Smart Data real-time data integration module, can deploy custom plugins to access and collect any kind of data. Vanilla ETL provides a bullet-proof infrastructure to run complex transformations and load data into any Big Data instance.
This is a major difference from other Data Science platforms, which provide only limited transformation processes rather than a complex transformation infrastructure. Once again, we drew on our experience with the Vanilla BI and Vanilla ETL platforms to make the difference in terms of ready-to-deploy infrastructure.