Python can be used to perform the ETL process. ETL involves collecting data from various data sources, transforming it, and loading it into a target system such as another database or a file. This article explains in depth how to use a Python script to extract data from an Oracle database into an Excel file.
For a Python script to perform the ETL process against an Oracle database, the following libraries are required:
cx_Oracle, used to connect to the Oracle database and query data; openpyxl, used to export the data to Excel; and pandas.
The Oracle Client libraries must be installed on the machine for cx_Oracle to connect to the database. Also make sure pandas and openpyxl are installed. The commands to run are shown below.
Installing cx_Oracle, pandas, and openpyxl
    !pip install cx_Oracle pandas
    !pip install openpyxl
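Before moving on, it can help to confirm that the three packages are actually importable. The following is a minimal sketch, assuming the import names match the pip package names; the helper `missing_modules` and the list `REQUIRED_MODULES` are hypothetical names introduced here for illustration.

```python
from importlib.util import find_spec

# Import names of the packages installed above
# (assumed to be the same as the pip package names).
REQUIRED_MODULES = ["cx_Oracle", "pandas", "openpyxl"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are available.")
```

This only checks that the packages can be located. It does not verify that the Oracle Client libraries are present, which cx_Oracle needs when it connects.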
Example of ETL Python Script
Here is an example of a Python script that uses the cx_Oracle library to connect to an Oracle database, read data from a table, and export the data to an Excel file using the openpyxl library:
    import cx_Oracle
    import openpyxl

    # Connect to the Oracle database
    connection = cx_Oracle.connect("username", "password", "hostname/service_name")
    cursor = connection.cursor()

    # Execute a SELECT statement to retrieve data from the database
    query = "SELECT * FROM myowntable"
    cursor.execute(query)

    # Fetch all the rows as a list of tuples
    rows = cursor.fetchall()

    # Create a new Excel workbook
    workbook = openpyxl.Workbook()

    # Get the active worksheet
    worksheet = workbook.active

    # Add the column headers to the worksheet
    # (the first item of each cursor.description entry is the column name)
    column_headers = [column[0] for column in cursor.description]
    worksheet.append(column_headers)

    # Add the rows to the worksheet
    for row in rows:
        worksheet.append(row)

    # Add filters to the column headers, covering the range that holds data
    worksheet.auto_filter.ref = worksheet.dimensions

    # Save the workbook to an Excel file
    workbook.save("export.xlsx")

    # Close the cursor and connection
    cursor.close()
    connection.close()

    print("Data exported successfully!")
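The script reads the whole table with `fetchall()`, which may use a lot of memory for large tables. As a possible alternative, the sketch below reads rows in batches with the standard DB-API `fetchmany()` call. The helper `iter_rows` is a hypothetical name, and `sqlite3` is used only as a stand-in so the sketch runs without an Oracle database; the demo table contents are made up.

```python
import sqlite3  # stand-in only, so the sketch runs without an Oracle database


def iter_rows(cursor, batch_size=1000):
    """Yield rows from an already-executed DB-API cursor, one batch at a time."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield from batch


# Demonstration against an in-memory SQLite table (hypothetical data).
demo_connection = sqlite3.connect(":memory:")
demo_cursor = demo_connection.cursor()
demo_cursor.execute("CREATE TABLE myowntable (id INTEGER, name TEXT)")
demo_cursor.executemany(
    "INSERT INTO myowntable VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)
demo_cursor.execute("SELECT id, name FROM myowntable ORDER BY id")
demo_rows = list(iter_rows(demo_cursor, batch_size=2))
demo_connection.close()
```

In the export script, a loop such as `for row in iter_rows(cursor): worksheet.append(row)` could take the place of the `fetchall()` call and the loop over `rows`, which avoids building the intermediate list.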
The script above performs the following steps:
- Connect to an Oracle database
- Retrieve the data from the specified table and store it as a list of tuples
- Add the column headers from the table to the first row of the Excel worksheet
- Add the data rows to the worksheet
- Add filters to the column headers
- Save the workbook to an Excel file called export.xlsx
- Close the cursor and connection
- Print the message “Data exported successfully!”
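The header step in the list above relies on `cursor.description`. In the Python DB-API, each entry of `cursor.description` is a 7-item sequence whose first item is the column name, so only that first item belongs in the header row. A minimal sketch follows; `header_names` is a hypothetical helper, and the sample description is hand-written with made-up column names.

```python
def header_names(description):
    """Return the column names from a DB-API cursor.description sequence."""
    return [column[0] for column in description]


# Hand-written stand-in for cursor.description: each entry is a 7-item
# sequence whose first item is the column name (the names are made up).
sample_description = [
    ("EMPLOYEE_ID", None, None, None, None, None, None),
    ("FIRST_NAME", None, None, None, None, None, None),
]

print(header_names(sample_description))  # ['EMPLOYEE_ID', 'FIRST_NAME']
```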
Before running the script, replace the placeholders (username, password, hostname/service_name) with your Oracle database credentials, and replace “myowntable” with the name of the table you want to read from.
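Rather than typing credentials directly into the script, one option is to read them from environment variables. The sketch below is only an illustration: the variable names `ORACLE_USER`, `ORACLE_PASSWORD`, and `ORACLE_DSN` and the helper `load_oracle_settings` are assumptions introduced here, not part of cx_Oracle.

```python
import os


def load_oracle_settings(env=None):
    """Build connection settings from environment variables.

    The variable names used here are hypothetical; choose any names you like.
    """
    env = os.environ if env is None else env
    required = ["ORACLE_USER", "ORACLE_PASSWORD", "ORACLE_DSN"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {
        "user": env["ORACLE_USER"],
        "password": env["ORACLE_PASSWORD"],
        "dsn": env["ORACLE_DSN"],  # for example "hostname/service_name"
    }


# In the export script, the settings would be passed to the connection call:
#   settings = load_oracle_settings()
#   connection = cx_Oracle.connect(
#       settings["user"], settings["password"], settings["dsn"]
#   )
```

Keeping the credentials outside the script makes it easier to share the file or put it under version control without exposing the password.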