Informatica Interview Questions Part – 1
1. What is a Data Warehouse?
A Data Warehouse is a collection of data marts representing historical data from different operational data source (OLTP). The data from these OLTP are structured and optimized for querying and data analysis in a Data Warehouse.
2. What is a Data mart?
A Data Mart is a subset of a data warehouse that can provide data for reporting and analysis on a section, unit or a department like Sales Dept, HR Dept, etc. The Data Mart are sometimes also called as HPQS (Higher Performance Query Structure).
3. What is OLAP?
OLAP stands for Online Analytical Processing. It uses database tables (Fact and Dimension tables) to enable multidimensional viewing, analysis and querying of large amount of data.
4. What is OLTP?
OLTP stands for Online Transaction Processing Except data warehouse databases the other databases are OLTPs. These OLTP uses normalized schema structure. These OLTP databases are designed for recording the daily operations and transactions of a business.
5. What are Dimensions?
Dimensions are categories by which summarized data can be viewed. For example a profit Fact table can be viewed by a time dimension.
6. What are Confirmed Dimensions?
The Dimensions which are reusable and fixed in nature Example customer, time, geography dimensions.
7. What are Fact Tables?
A Fact Table is a table that contains summarized numerical (facts) and historical data. This Fact Table has a foreign key-primary key relation with a dimension table. The Fact Table maintains the information in 3rd normal form.
A star schema is defined is defined as a logical database design in which there will be a centrally located fact table which is surrounded by at least one or more dimension tables. This design is best suited for Data Warehouse or Data Mart.
8. What are the types of Facts?
The types of Facts are as follows.
- Additive Facts: A Fact which can be summed up for any of the dimension available in the fact table.
- Semi-Additive Facts: A Fact which can be summed up to a few dimensions and not for all dimensions available in the fact table.
- Non-Additive Fact: A Fact which cannot be summed up for any of the dimensions available in the fact table.
9. What are the types of Fact Tables?
The types of Fact Tables are:
- Cumulative Fact Table: This type of fact tables generally describes what was happened over the period of time. They contain additive facts.
- Snapshot Fact Table: This type of fact table deals with the particular period of time. They contain non-additive and semi-additive facts
10. What is Grain of Fact?
The Grain of Fact is defined as the level at which the fact information is stored in a fact table. This is also called as Fact Granularity or Fact Event Level.
11. What is Factless Fact table?
The Fact Table which does not contains facts is called as Fact Table. Generally when we need to combine two data marts, then one data mart will have a fact less fact table and other one with common fact table.
12. What are Measures?
Measures are numeric data based on columns in a fact table.
15. What is a Star schema design?
A Star schema is defined as a logical database design in which there will be a centrally located fact table which is surrounded by at least one or more dimension tables. This design is best suited for Data Warehouse or Data Mart.
16. What is Snow Flake schema Design?
In a Snow Flake design the dimension table (de-normalized table) will be further divided into one or more dimensions (normalized tables) to organize the information in a better structural format. To design snow flake we should first design star schema design.
17. What is Operational Data Store [ODS] ?
It is a collection of integrated databases designed to support operational monitoring. Unlike the OLTP databases, the data in the ODS are integrated, subject oriented and enterprise wide data.
18. What is DE normalization?
DE normalization means a table with multi duplicate key. The dimension table follows DE normalization method with the technique of surrogate key
19. What is Surrogate Key?
A Surrogate Key is a sequence generated key which is assigned to be a primary key in the system (table)
20. What is Metadata?
Data about data is called as Metadata. The Metadata contains the definition of a data.
21. What is a Repository?
Repository is a centrally stored container which stores the metadata, which is used by the Informatica Power center server and Power Center client tools. The Informatica stores Repository in relational database format.
22. What is Data Acquisition Process?
The process of extracting the data from different source (operational databases) systems, integrating the data and transforming the data into a homogenous format and loading into the target warehouse database. Simple called as ETL (Extraction, Transformation and Loading). The Data Acquisition process designs are called in different manners by different ETL vendors.
- Informatica —-> Mapping
- Data Stage —-> Job
- Abinitio —-> Graph
23. What are the GUI based ETL tools?
The following are the GUI based ETL tools:
- Data Stage
- Data Junction
- Oracle Warehouse Builder
- Business Object Data Integrator
- Cognos Decision Stream.
24. What are programmatic based ETL tools?
- SAS BASE
- SAS ACCESS
- Tera Data Utilities
- Fast Load
- Multi Load
- Fast Export
- T (Trickle) Pump
25. What is a Transformation?
A transformation is a repository object that generates, modifies, or passes data. Transformations in a mapping represent the operations the PowerCenter Server performs on the data. Data passes into and out of transformations through ports that you link in a mapping or mapplet. Transformations can be active or passive. An active transformation can change the number of rows that pass through it. A passive transformation does not change the number of rows that pass through it.
26. What are the types of metadata stored in Repository?
The following types of metadata are stored in Repository:
- Database connections
- Global objects
- Multi-dimensional metadata
- Reusable Transformations
- Sessions and Batches
- Source Definitions
- Target definitions
27. What are the types of locks in Repository?
There are two types of Locks in Repository:
- Read Lock
- Write Lock
- Execute Lock
- Fetch Lock
- Save Lock
28. What are Repository objects which we can export?
We can export the following Repository objects:
29. What is a Work Flow?
A Work Flow is a set of instructions on how to execute tasks such as sessions, emails and shell commands. A WorkFlow is created from Workflow Manager.
30. What are actions which can be performed by pmcmd command?
We can perform the following actions with pmcmd:
- Check whether the Informatica server is running
- Start and stop sessions and batches
- Recover sessions.
- Stop the Informatica server.
pmcmd returns zero on success and non zero on failure
31. What is commit interval?
A commit interval is the interval at which the Informatica Server commits data to relational targets during a session.
32. What is the use of Stored Procedure Transformation?
We use the Stored Procedure Transformation for populating and maintaining the database.
33. What is the use of partitioning the sessions?
The partitioning of session increases the session performance by reducing the time period of reading the source data and loading the data into the target.
34. What is the uses of Lookup Transformation?
The Lookup Transformation is useful for:
- Getting a related value form a table using a key column value
- Update slowly changing dimension table
- To check whether records already exists in the table.
35. What is a Parameter File?
The parameter File is used to define the values of the parameters and variables used in a session. It is a file created in a notepad and saved with .prm extension.
36. What is Metadata Reporter?
It is a web based application that enables you to run reports against the repository metadata. With a metadata reporter you can access information about your repository without having knowledge of SQL.
37. What is meant by Lookup Cache?
The Informatica server builts a cache in memory when it process the first row of a data in a cached lookup transformation.
38. What is a Load Manager?
The Load Manager is a primary Informatica Server process. It performs the following tasks:
- Manages sessions and batch scheduling
- Locks the session and read the session properties
- Read the parameter file
- Expand the server and session variables and parameters.
- Verify permissions and privileges
39. What are the tasks performed by Sequence Generator Transformation?
- Create keys
- Replace missing values
- Cycle through a sequential range of numbers.
40. What is the end value of the Sequence Generator?
The end value of the Sequence Generator is 2147483647.
41. What are variables supplied by the Transaction Control Transformation?
- TC_ ROLLBACK_BEFORE
- TC_CONTINUE_TRANSACTION [Default]
42. How to implement Update Strategy?
To implement Update Strategy Transformation the source and target table should have primary keys to compare the records the records and to find out the latest changes happened.
43. What are constants of Update Strategy Transformation?
The constants of Update Strategy Transformation are:
- DD_INSERT – 0
- DD_UPDATE – 1
- DD_DELETE – 2
- DD_REJECT – 3
DD Stands For Data Driven
44. What are the benefits of Star Schema Design?
- Fewer tables
- Designed for analysis across time
- Simplify joins
- Less database space
- Supports drilling on reports
45. What is Data Scrubbing?
The Data Scrubbing is the process of cleaning up the junk in the legacy data and make it accurate and useful. Simply, making good data out of bad data.
46. What are Bad Rows (Rejected Rows)?
The Informatica Server will dumped the bad or rejected rows which are sent out by the transformation into a text file with tablename.bad extension.
47. What are types of Dimensional Modeling?
- Conceptual Modeling
- Physical Modeling
- Logical Modeling
48. What is Forward Engineering?
Using the Erwin tool the data modeler will convert the .SQL script (logical structure of tables) into a physical structure tables at the database level, this is called as Forward Engineering.
49. What is common use of creating a Factless Fact Table?
The most common use of creating a Factless fact table is to capture date transaction events.
50. What are the different sources of Source systems of Data Warehouse?
- Flat Files
- XML Files
- SAP R/3
- SAP BW
- Web Methods
- Web Services
- Cobol Files
- Legacy Systems
51. What are the Session Partitions types?
- Hash keys
- Key range
- Database partitioning
52.While importing source definition the metadata that will be imported are:
- Source Name
- Database Location
- Column Names
- Data Types
- Key Constraints
53. We can stop the Batch by two ways:
- Workflow Monitor
- By pmcmd command
54. What is stop the Batch and types of Batches?
Grouping of sessions is known as Batch. There are two types of batches:
55. What is a tracing level and types of Tracing level?
Tracing level represents the amount of information that Informatica server writes in a log file. Types of Tracing levels are:
- Verbose lnit
- Verbose Data
56. What is the default join that source qualifier provides?
57. Types of Slowly Changing Dimensions:
- Type – 1 (Recent updates)
- Type – 11 (Full historical information)
- Type – 111 (Partial historical information)
58. What are Update Strategy’s target table options?
- Update as Update: Updates each row flagged for update if it exists in the table
- Update as Insert: Inserts a new row for each update.
- Update else Insert: Updates if row exists, else inserts.
59. What does a Mapping document contains?
The Mapping document contains the following information :
- Source Definition – from where the database has to be loaded
- Target Definition – to where the database has to be loaded
- Business Logic – what logic has to be implemented in staging area.
60. What does the Top Down Approach says?
The Top Down Approach is coined by Bill Immon. According to his approach he says “First we need to implement the Enterprise data warehouse by extracting the data from individual departments and from the Enterprise data warehouse develop subject oriented databases called as “Data Marts”.
61. What does the Bottom Up Approach or Ralph Kimball Approach says?
The Bottom Down Approach is coined by Ralph Kimball. According to his approach he says “First we need to develop subject oriented database called as “Data Marts” then integrate all the Data Marts to develop the Enterprise data warehouse.
62. Who is the first person in the organization to start the Data Warehouse project?
The first person to start the Data Warehouse project in a organization is Business Analyst.
63. What is a Dimension Modeling?
A Dimensional Modeling is a high level methodology used to implement the start schema structure which is done by the Data Modeler.
64. What are the types of OLAPs ?
- DOLAP: The OLAP tool which words with desktop databases are called as DOLAP. Example: Cognos EP 7 Series and Business Objects, Micro strategy.
- ROLAP: The OLAP which works with Relational databases are called as ROLAP. Example: Business Object, Micro strategy, Cognos ReportNet and BRIO.
- MOLAP: The OLAP which is responsible for creating multidimensional structures called cubes are called as MOLAP. Example: Cognos ReportNet.
- HOLAP: The OLAP which uses the combined features of ROLAP and MOLAP are called as HOLAP. Example Cognos ReportNet.
65. What is the extension of Repository backup?
The extension of the Repository backup is .rep
66. Which join is not supported by Joiner Transformation?
The non-equi joins are not supported by joiner Transformation.
67. What is SQL Override?
Applying the joining condition in the source qualifier is called as sql override.
68.What is Rank Index?
When you create a Rank Transformation by default “Rank Index” port will be created, to store the number of ranks specified.
69. What is Sort Key?
The column on which the sorting takes place in the Sorter Transformation is called as “Sort Key” Column.
70. What is default group in Router Transformation?
In the Router Transformation the rejected rows are captured by default group and the data will be passed to target table.
71. What is unconnected Transformation?
The transformation which does not involve in mapping data flow is called as Unconnected Transformation.
72. What is Connected Transformation?
The Transformation which involve in mapping data flow is called as connected transformation. By default all the transformation are connected transformation.
73. Which Transformation is responsible to maintain updates in warehouse database?
Update Strategy Transformation.
74. What are the caches contained by the Look up Transformation?
- Static Lookup cache
- Dynamic Lookup Cache
- Persistent Lookup Cache
- Data cache
- Index cache
75. What are the Direct and Indirect methods in the Flat file extraction?
In the direct method the extract the flat file by using its own meta data. In indirect method we extract all the flat files by using one flat file’s meta data.