Apache Tajo google
A big data warehouse system on Hadoop. Apache Tajo is a robust big data relational and distributed data warehouse system for Apache Hadoop. Tajo is designed for low-latency and scalable ad-hoc queries, online aggregation, and ETL (extract-transform-load process) on large-data sets stored on HDFS (Hadoop Distributed File System) and other data sources. By supporting SQL standards and leveraging advanced database techniques, Tajo allows direct control of distributed execution and data flow across a variety of query evaluation strategies and optimization opportunities. …

Initial Data Analysis (IDA) google
The most important distinction between the initial data analysis phase and the main analysis phase, is that during initial data analysis one refrains from any analysis that is aimed at answering the original research question. The initial data analysis phase is guided by the following four questions:
• Quality of data
• Quality of measurements
• Initial transformations
• Did the implementation of the study fulfill the intentions of the research design? …

Relative Risk google
In statistics and epidemiology, relative risk or risk ratio (RR) is the ratio of the probability of an event occurring (for example, developing a disease, being injured) in an exposed group to the probability of the event occurring in a comparison, non-exposed group. Relative risk includes two important features:
(i) a comparison of risk between two ‘exposures’ puts risks in context, and
(ii) ‘exposure’ is ensured by having proper denominators for each group representing the exposure …