Data2Ads: from customer data to marketing slogans
This project was developed in Python and deployed on Streamlit as an interactive web application and its main goal was to segment customers into behavioral personas based on their purchase data and to use AI to automatically generate cluster descriptions, product photo analyses, and marketing slogans tailored to each group.
The application was created in Polish and designed to guide the user step-by-step through data upload, segmentation, and content generation. It demonstrates a strong integration of machine learning (KMeans, Silhouette Score) and generative AI (OpenAI GPT-4o-mini), showcasing how data-driven segmentation can directly support personalized marketing automation.
Project realized as part of the Data Scientist Certificate
Technologies: Python, Streamlit, Scikit-learn, PyCaret, OpenAI API
Language: Polish
Available at: https://slogans.streamlit.app/
Technical Highlights
-
Machine Learning: KMeans clustering with Silhouette Score for segment optimization.
-
Automation Framework: PyCaret for clustering workflows.
-
AI Integration: OpenAI GPT-4o-mini for persona descriptions, photo analysis, and slogan generation.
-
User Interface: Streamlit app with structured tabs for a step-by-step workflow.
-
File Handling: CSV and Excel file validation, upload, and download functions.
Objective
The primary objective is to:
1. Group customers into personas according to dominant purchased products and colors.
2. Generate AI-based descriptions for each persona (including shopping behavior, preferred colors, and product types).
3. Upload and analyze product photos, automatically generating descriptions emphasizing visual characteristics such as main color.
4. Produce marketing slogans optimized for Google Ads and Meta Ads, reflecting the interests of each customer segment.
Application Structure
The Streamlit app is divided into five functional tabs, each representing a stage of the data and content generation process:
-
The user downloads an Excel template with predefined columns and fills it with customer and purchase data.
-
The file structure is validated on upload — any missing or incorrect columns result in immediate rejection.
-
This ensures clean, consistent data input before further processing.
2️⃣Data Segmentation (Segmentacja danych)
-
Once the file is approved, the app calculates the optimal number of clusters using KMeans and Silhouette Score from Scikit-learn.
-
After determining the best number of clusters, the user can perform segmentation using PyCaret, NumPy, Pandas, and Datetime.
-
The segmented dataset is displayed in a table and can be downloaded as a CSV file for further analysis.
At this stage, a connection is made with OpenAI API to automatically name and describe each customer cluster.
-
-
-
Libraries used:
json,io,pandas,openai,dotenv. -
The GPT-4o-mini model is prompted in Polish with strict guardrails to ensure data accuracy:
-
No invented products or colors.
-
The number of clusters must match the calculated value.
-
Each description must include shopping behavior, preferred payment methods, product types, and colors.
-
-
Generated cluster names and descriptions can be downloaded as a CSV file.
-
-
- The user uploads a product image related to one of the identified segments (e.g., item type or dominant color).
- The app uses PIL (Python Imaging Library) to process the photo, then calls OpenAI to generate a concise product description in Polish, explicitly mentioning the product’s main color.
-
-
-
-
Supporting libraries:
time,base64,dotenv, andopenai.
-
-
-
- Using the previously identified item type and dominant color, the app generates targeted marketing slogans for Google Ads and Meta Ads.
- The slogans are personalized for the corresponding customer clusters, ensuring alignment between product visuals and audience preferences.
- This automation allows for quick adaptation of marketing copy based on data insights
-
A dedicated section explaining how to prepare the dataset, configure the OpenAI API key, and interpret generated outputs.
-
The user must use their own OpenAI API key, which can be obtained from https://platform.openai.com
Outcome
The project successfully demonstrates how data science and AI can combine to create a fully automated marketing intelligence tool. It transforms raw purchase data into actionable insights — segmenting customers, analyzing visual content, and producing creative marketing materials within one workflow.
By integrating machine learning models, image analysis, and generative AI, this project showcases advanced practical applications of data science techniques in the field of personalized marketing.