Radimir Dedecek

Technology Stack

Python
REST API
Pandas
NumPy
Scikit-learn
TensorFlow
C / C++
Docker & K8s
Linux
Git
Grafana
Bootstrap

Professional Overview

I am a Full-Stack Developer specializing in high-performance backends and scalable cloud infrastructure. This single-page application serves as my interactive portfolio, built to demonstrate hands-on competence across my primary technology stack, from modern front-end design to robust back-end development principles. It implements a hybrid architecture that uses local K8s-optimized TensorFlow models in development and fails over to Cloud Vision APIs in serverless production to optimize latency and cost.

Languages & Core

Python: Flask, PEP 8, Scripting
C / C++: Performance, Modern Standards
JavaScript: ES6+, DOM, Async/Fetch
SQL: Relational Database Design
Bash: Shell Scripting & Automation

Web & Data

Frameworks: Flask, Jinja2, Bootstrap 5
UI/UX: Responsive & Adaptive Design
AI/ML: TensorFlow, Keras, Scikit-learn
Analysis: NumPy, Pandas, Matplotlib
APIs: REST, JSON Serialization

DevOps & Systems

Orchestration: Docker, Kubernetes
Cloud: Google Cloud Platform (GCP)
VCS: Git (Branching, PRs, Workflows)
OS: Linux (Ubuntu/Debian) Admin
Environment: Virtualenv, PIP, OCI Images

I write clean, PEP 8 compliant code and leverage Linux environments for all development and deployment tasks.

What You Will Find Here:

**Core Technologies:** Practical examples and explanations of my skills in **Python**, **Bootstrap 5**, and data science tooling.
**Development Process:** Evidence of a workflow using **Linux** and **Git**, demonstrating efficient software development practices.

Use the menu on the left or the main header links to quickly navigate through the skills and project examples.

Python

Python is the primary language for my backend development and data science stack, with a focus on clean, readable code and robust error handling. The demos draw on object-oriented programming (OOP), efficient data structure handling (comprehensions), decorators, ... Python is used across all major components of this portfolio: managing the routing via Flask, processing external data in the REST API section, and performing complex calculations in the Pandas and future data science sections.

Interactive Demo: Python Performance (3 x Fibonacci) & Testing

This demo utilizes the server-side Python environment (Flask) to showcase two critical aspects of Python development: **Performance Optimization** and **Code Quality**. The first test compares the execution time of three Fibonacci functions (slow recursive, memoized recursive, and iterative loop) to demonstrate optimization needs and the practical use of advanced language features like a **memoization decorator** to drastically improve performance. The second test allows you to run a **pytest** unit test suite against the backend logic, confirming proficiency in writing and executing tests to ensure code reliability and correctness.
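For comparison, the same memoization idea can be sketched with the standard library's `functools.lru_cache` instead of a hand-written decorator. This is a minimal illustration and not the code behind the demo; the names `fib_naive`, `fib_cached`, and `timed` are hypothetical.

```python
from functools import lru_cache
from time import perf_counter


def fib_naive(n: int) -> int:
    """Plain recursion: recomputes the same subproblems over and over."""
    if n <= 1:
        return max(n, 0)  # F(0) = 0, F(1) = 1, negative inputs map to 0
    return fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_cached(n: int) -> int:
    """Same recursion, but each distinct n is computed once and then cached."""
    if n <= 1:
        return max(n, 0)
    return fib_cached(n - 1) + fib_cached(n - 2)


def timed(func, n: int):
    """Return (result, elapsed seconds) for a single call."""
    start = perf_counter()
    result = func(n)
    return result, perf_counter() - start


if __name__ == "__main__":
    for func in (fib_naive, fib_cached):
        value, seconds = timed(func, 20)
        print(f"{func.__name__}(20) = {value} in {seconds:.6f}s")
```

`lru_cache` also exposes `cache_clear()` and `cache_info()`, which cover the same need as a custom cache-reset hook when tests must start from an empty cache.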

Server-Side Python Code Snapshot: api.py

            
######################################################################################
#                      Python Performance Test (3 x Fibonacci)                       #
######################################################################################
from functools import wraps
from time import time
      
N_VALUE = 13 # target Fibonacci number Fn where n = 13

# --- Helper Function ---
def duration(dur): 
    """Formats time duration into [s ms µs] string."""
    s  = int(dur)
    ms = int((dur-s) * 1000)
    mu = int((dur-s-ms/1000) * 1000 * 1000)
    return f"{s:2}s {ms:3}ms {mu:3}μs"

# --- Decorators --- 
def accelerator(func):
    """Decorator that caches a function's results."""
    cache = {}
    @wraps(func)
    def wrapper(*args):
        # 1. Check if the result is already known
        n_arg = args[0]
        if n_arg in cache:
            return cache[n_arg], None, None # Return cached result instantly
        # 2. Calculate the result if not cached
        result, run_time, out_str = func(*args)
        # 3. Store the result before returning
        cache[n_arg] = result
        return result, run_time, out_str
    # 4. EXPOSE THE CACHE CLEAR FUNCTION
    def clear_cache():
        """Clears the cache dictionary."""
        nonlocal cache
        cache = {}
    wrapper.clear_cache = clear_cache
    return wrapper

def timer(func):
    """Decorator that measures the duration of a function."""
    # Initialize a flag on the function object to track if we are in the timing context
    func._timing_context = {'in_progress': False}     
    @wraps(func)
    def wrapper(*args, **kwargs):
        # Check if timing is already in progress for this function
        if func._timing_context['in_progress']:
            return func(*args, **kwargs)
        else:
            func._timing_context['in_progress'] = True
            start = time()
            try:
                # Execute the function (this will trigger all the recursive calls)
                result_tuple = func(*args, **kwargs)
            finally:
                # Ensure the flag is reset even if the function raises an error
                func._timing_context['in_progress'] = False
            # Print the total time only once
            result = result_tuple[0]
            run_time = time() - start
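            # args[1] is the baseline run time used for the speed comparison
            str1= " -> " + "   base time for speed comparison"
            if args[1] and run_time > 0:   # guard against a missing baseline or a zero-length run
                str1= f" -> {round(args[1]/run_time):8,} times faster calculation"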
            out_str = f"Finished: {func.__name__}({args[0]}) = {result}, run_time: {duration(run_time)}{str1}"
            return result, run_time, out_str
    return wrapper

@timer 
def slow_fibonacci(n, d1): # This simulates a slow, expensive calculation without accelerator
    if n < 0:
        return 0, None, None
    if n <= 1:
        return n, None, None
    return slow_fibonacci(n - 1, d1)[0] + slow_fibonacci(n - 2, d1)[0], None, None

@accelerator
@timer 
def fast_fibonacci(n, d1): # This simulates a slow, expensive calculation with accelerator
    if n < 0:
        return 0, None, None
    if n <= 1:
        return n, None, None
    return fast_fibonacci(n - 1, d1)[0] + fast_fibonacci(n - 2, d1)[0], None, None

@timer    
def loop_fibonacci(n,d1): # This simulates a loop calculation without accelerator
    f2=0
    f1=1
    if n<1:
        return 0, None, None
    for _ in range(n-1):
        f1 = f2 + f1
        f2 = f1 - f2
    return f1, None, None

# ####################################################### TESTING AREA ####################################################### # 
# import re
# N_VALUE = 1
# fib_result1, total_time1, str1 = slow_fibonacci(N_VALUE, 0)           # slow recursive fibonacci calculation without accelerator
# fib_result2, total_time2, str2 = fast_fibonacci(N_VALUE, total_time1) # slow recursive fibonacci calculation with accelerator
# fib_result3, total_time3, str3 = loop_fibonacci(N_VALUE, total_time1) # loop fibonacci calculation 
# print(f"{re.sub(r'<.*?>', '', str1)}\n{re.sub(r'<.*?>', '', str2)}\n{re.sub(r'<.*?>', '', str3)}")

Fibonacci number Fn, enter n (e.g., 20 or 35 for slower results):

Server-Side Python Code Snapshot: api_test.py

            
######################################################################################
#         Python api_test.py -  automated testing using the pytest framework         #
######################################################################################
import pytest
import time
from fibonacci import fast_fibonacci, slow_fibonacci, loop_fibonacci 

# --- Constants for Tests ---
TEST_N = 10 
EXPECTED_RESULT = 55 # F(10) is 55
N_PERF = 30 # Used for caching performance test

# --- Fixture to ensure cache is clear before performance tests ---
@pytest.fixture(autouse=True)
def clear_fib_cache():
    """Clears the cache for fast_fibonacci before running tests."""
    try:
        # Note: We rely on the decorator adding the clear_cache method
        fast_fibonacci.clear_cache()
    except AttributeError:
        # Pass silently if the decorator structure is modified or cache isn't available
        pass

# --- Correctness Tests ---
def test_01_slow_recursive_correctness():
    """Tests the Naive Recursive function (O(2^n))."""
    # Functions return (result, time, string). We only care about the result [0].
    result, _, _ = slow_fibonacci(TEST_N, 0)
    assert result == EXPECTED_RESULT, f"Slow recursive failed: Expected {EXPECTED_RESULT}, got {result}"

def test_02_fast_recursive_correctness():
    """Tests the Memoized Recursive function (O(n))."""
    result, _, _ = fast_fibonacci(TEST_N, 0)
    assert result == EXPECTED_RESULT, f"Fast recursive failed: Expected {EXPECTED_RESULT}, got {result}"

def test_03_loop_iterative_correctness():
    """Tests the Iterative Loop function (O(n))."""
    result, _, _ = loop_fibonacci(TEST_N, 0)
    assert result == EXPECTED_RESULT, f"Loop iterative failed: Expected {EXPECTED_RESULT}, got {result}"

# --- Performance Test ---
def test_04_fast_recursive_performance():
    """
    Tests if the fast_fibonacci function is significantly faster on a second run 
    (proving the O(1) cache hit).
    """
    # 1. First run (populates cache)
    start_time_1 = time.time()
    fast_fibonacci(N_PERF, 0)
    duration_1 = time.time() - start_time_1
    
    # 2. Second run (cache hit)
    start_time_2 = time.time()
    fast_fibonacci(N_PERF, 0)
    duration_2 = time.time() - start_time_2
    
    # Assert the second run is much faster (e.g., at least 10x faster)
    assert duration_2 < duration_1 * 0.1 + 0.001, \
        f"Cache failed: Second run ({duration_2:.6f}s) was not significantly faster than first ({duration_1:.6f}s)."

# --- Edge Case / Validation Test ---
# We use parametrize to run the same test logic with different inputs/expected outcomes
@pytest.mark.parametrize("n_input, expected_result", [
    (-2, 0),   # F(-2) should be 0
    (0, 0),    # F(0)  should be 0
    (1, 1),    # F(1)  should be 1
    (2, 1),    # F(2)  should be 1
    (13, 233)  # F(13) should be 233
])
def test_05_boundary_correctness(n_input, expected_result):
    """Test boundary and small inputs for loop function."""
    result, _, _ = loop_fibonacci(n_input, 0)
    assert result == expected_result, f"F({n_input}) failed. Expected {expected_result}, got {result}"

def test_06_negative_input_loop():
    """Tests that the improved loop_fibonacci correctly returns 0 for n < 1."""
    # If n < 1, the function should return 0
    result, _, _ = loop_fibonacci(-5, 0)
    assert result == 0, "Loop function did not handle negative input correctly (expected 0)."

#  command line:
#  pytest test_fibonacci.py

Automated testing using the pytest framework:

TERMINAL OUTPUT:

Click the "RUN fibonacci.py" button above to execute the server-side Python function and see the performance results here, or click the "RUN test_fibonacci.py" button above to execute the server-side pytest suite and see the test results here.
    

Technical Toolkit: Python Ecosystem

A comprehensive index of libraries and frameworks utilized across my web development, automation, and data science projects.

Automation & Scraping

Selenium WebDriver
Beautiful Soup 4
Requests API

Data Science & Viz

Plotly / Matplotlib
Seaborn
Scikit-learn

Backend & Database

SQLAlchemy
WTForms
Pytest

REST API

Test Project: External API Consumption & Data Pipeline Foundation

This section demonstrates proficiency in the critical first step of any data science pipeline: **reliable and efficient data acquisition** from external sources, paired with **cloud-native persistence**. It uses a Python Flask backend to consume the external **The Movie Database (TMDB) API** and then serves the processed data to the frontend.

This setup implements a robust **Cache-or-Fetch** pattern to optimize performance and respect API rate limits. The caching layer utilizes **Google Cloud Firestore**, demonstrating skills in building fast, scalable, and cloud-persistent data solutions.

Focus Area: TMDB Data Acquisition

To source a dynamic, real-world dataset, **The Movie Database (TMDB) API** was utilized. This requires proficiency in:

**API Backend Design:** Building a clean Flask endpoint (/api/fetch_movies) to serve processed JSON data.
**REST API Consumption:** Experience using the requests library for authenticated external API interaction.
**Cloud Persistence (Firestore):** Implementing the **Cache-or-Fetch** pattern using a live NoSQL database (Firestore) instead of local files for better scalability and real-time status feedback.
**Request Handling:** Using the requests library to handle rate-limited API calls and pagination for large datasets.
**JSON Processing:** Efficiently parsing and flattening complex JSON structures containing nested lists (e.g., genre lists, production companies).
**Data Transformation:** Preparing raw data into a structured format suitable for further analysis (e.g., in the Pandas section).
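The JSON flattening step in the list above might look like the following sketch. The field names `genres` and `production_companies` mirror the examples named there, but the exact record shape and the helper `flatten_movie` are assumptions for illustration.

```python
def flatten_movie(record: dict) -> dict:
    """Flatten one nested movie record into a single-level dict."""
    return {
        "title": record.get("title"),
        "vote_average": record.get("vote_average"),
        # Nested lists of {"id": ..., "name": ...} objects become comma-separated names
        "genres": ", ".join(g["name"] for g in record.get("genres", [])),
        "companies": ", ".join(
            c["name"] for c in record.get("production_companies", [])
        ),
    }


sample = {
    "title": "Example Movie",  # made-up record, for illustration only
    "vote_average": 7.5,
    "genres": [{"id": 18, "name": "Drama"}, {"id": 80, "name": "Crime"}],
    "production_companies": [{"id": 1, "name": "Example Studio"}],
}
flat = flatten_movie(sample)
```

A flat dict per movie can then be passed as a list of records to a tabular tool without further unpacking.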

Server-Side Code Snapshot (Python / REST API Implementation / Cache Check Logic):

# ---------------------------------------------------------------------------------- #  
# ---------------------------- FIRESTORE INITIALIZATION ---------------------------- #
# ---------------------------------------------------------------------------------- # 
db_firestore_client = None 
if api_utils.FLASK_ENV != "USE_LOCAL_FILE_FOR_TESTING":
    if not firebase_admin._apps:                             # Prevent "App already exists" error
        try:
            if api_utils.FLASK_ENV == "PRODUCTION":
                print("Firestore: Initializing in PRODUCTION mode (No JSON key needed).")
                firebase_admin.initialize_app()              # GCP Cloud Run Path: Uses Service Identities automatically
            else:
                # Local/Development Path: Uses the serviceAccountKey.json
                print(f"Firestore: Initializing in DEVELOPMENT mode using: {api_utils.GOOGLE_APPLICATION_CREDENTIALS}")
                cred = credentials.Certificate(api_utils.GOOGLE_APPLICATION_CREDENTIALS)
                firebase_admin.initialize_app(cred)          # Initialize the app 
            print("SUCCESS: Firestore client initialized.")
            db_firestore_client = firestore.client()         # Get the Firestore client instance
        except Exception as e:
            print(f"FATAL ERROR during Firestore initialization. Firestore will be disabled: {e}")

######################################################################################
###                    REST API section AJAX endpoint                              ###
###            Exposes a RESTful endpoint to the frontend                          ###
###   It calls the core data loading function and returns a clean subset of data   ###
######################################################################################
@app.route('/api/fetch_movies', methods=['POST'])
def fetch_movies_endpoint():

    # Load up to 6 pages (120 movies) for the cache, but only return the top 5
    data, cache_status = api_utils.load_movie_data(6, db_firestore_client)   
    if not data or data.get('error'):   # data can be None when no source is available
        error_msg = data.get('error') if data else cache_status
        return jsonify({"movies": [], "cache_status": cache_status, "error": error_msg}), 500

    # Process movie list (Limit to top 5)
    movie_list = []
    for i, result in enumerate(data.get('results', [])):
        if i >= 5: # Limit to top 5 for display
            break
        movie_list.append({
            "title": result.get('title'),
            "score": f"{(result.get('vote_average') or 0):.1f}", # Format score to 1 decimal (0.0 if missing)
            "votes": result.get('vote_count'),
            "release_date": result.get('release_date'),
            # Extract the year; fall back to 0 when the date is missing or empty
            "year": int((result.get('release_date') or '0000').split('-')[0])
        })
    # Return the full JSON response, including the new cache_status
    return jsonify({
        "movies": movie_list, 
        "cache_status": cache_status, 
        "total_cached": len(data.get('results', [])),
        "expiration": api_utils.CACHE_TTL_MINUTES/60
    })

# ---------------------------------------------------------------------------------- #
# ---                            FIRESTORE LOAD DATA                             --- #
# --- Reads the cached movie data and timestamp from the Firestore document.     --- #
# --- In "USE_LOCAL_FILE_FOR_TESTING" mode, reads the movie data from a local file.  #
# ---------------------------------------------------------------------------------- #
def load_TMDB_data(db_client):
    if FLASK_ENV == "USE_LOCAL_FILE_FOR_TESTING":                         # testing - Fetch from Local File
        if os.path.exists(TMDB_DATA_FILE):
            try:
                with open(TMDB_DATA_FILE, 'r', encoding='utf-8') as f:
                    data = json.load(f)
                    status_msg = f"{FLASK_ENV}: Fetched from local file TMDB_DATA_FILE: {TMDB_DATA_FILE}"
                    print(status_msg)
                    # Wrap in a dict so the shape matches the Firestore cache payload
                    return {"results": data.get('results', [])}, status_msg
            except Exception as e:
                status_msg = f"{FLASK_ENV}: Failed to fetch data from local file TMDB_DATA_FILE: {TMDB_DATA_FILE} ({e})"
                print(status_msg)
        else:
            status_msg = f"{FLASK_ENV}: Failed to fetch data. Local file does not exist, TMDB_DATA_FILE: {TMDB_DATA_FILE}"
            print(status_msg)
        return None, status_msg
        
    if db_client is None:                                                        # Fetch from Firestore
        print("Cloud read skipped: Firestore client is None.")
        return None, "Firestore not initialized."
    try:
        doc_ref = db_client.collection(FIRESTORE_COLLECTION).document(FIRESTORE_DOCUMENT)
        doc = doc_ref.get()
        if not doc.exists:
            print("Cloud cache document not found.")
            return None, "Cloud cache empty."
        data_entry = doc.to_dict()
        last_fetch = data_entry.get("timestamp")
        now_utc = datetime.now(timezone.utc)
        if last_fetch:
            # Firestore timestamps are usually timezone-aware datetime objects
            if not last_fetch.tzinfo:
                # If timezone is missing, assume UTC for local comparison
                last_fetch = last_fetch.replace(tzinfo=timezone.utc)
            # Check whether the cache is still within its TTL
            if last_fetch + timedelta(minutes=CACHE_TTL_MINUTES) > now_utc:
                status_msg = f"SUCCESS: Data loaded from REAL Firestore Cache (Last updated: {last_fetch.strftime('%Y-%m-%d %H:%M:%S UTC')})."
                print(status_msg)
                return data_entry.get("data"), status_msg   
            else:
                status_msg = f"REAL Firestore Cache expired. Last fetch: {last_fetch.strftime('%Y-%m-%d %H:%M:%S UTC')}"
                print(status_msg)
                return None, status_msg
        else:
            status_msg = "Firestore document found, but timestamp field is missing or invalid."
            print(status_msg)
            return None, status_msg
    # Use specific GCP exceptions for better clarity
    except gcp_exceptions.PermissionDenied as e:
        status_msg = f"Error reading from Firestore (Permission Denied): {e}"
        print(status_msg)
        return None, status_msg
    except Exception as e:
        status_msg = f"Generic Error reading from Firestore: {e}"
        print(status_msg)
        return None, status_msg

# ---------------------------------------------------------------------------------- #
# ---                     SAVE TMDB DATA TO FIRESTORE CLOUD                      --- #
# --- Writes the movie data and the current timestamp to the Firestore document.---- #
# ---------------------------------------------------------------------------------- #
def save_TMDB_data_to_cloud(data, db_client):
    if db_client is None:
        status_msg = f"--- Cloud write skipped: Firestore client is None/not initialized."
        print(status_msg)
        return status_msg
    print("Writing data to REAL Firestore Persistence layer...")
    data_to_store = {
        # IMPORTANT: Use timezone.utc for consistency
        "timestamp": datetime.now(timezone.utc), 
        "data": data 
    }
    try:
        doc_ref = db_client.collection(FIRESTORE_COLLECTION).document(FIRESTORE_DOCUMENT)
        doc_ref.set(data_to_store) 
        status_msg = "SUCCESS: Data written to Firestore."
        print(status_msg)
        return status_msg
    # Use specific GCP exceptions for definitive error reporting
    except gcp_exceptions.PermissionDenied as e:
        status_msg = f"--- FIRESTORE WRITE FAILED (Permission Denied): Check Rules --- Error: {e}"
        print(status_msg)
        return status_msg
    except Exception as e:
        status_msg = f"--- FIRESTORE WRITE FAILED (Unknown Error) --- Error: {e}"
        print(status_msg)
        return status_msg

# ---------------------------------------------------------------------------------- #      
# ---                   TMDB Data Loading (Cache-or-Fetch Logic)                ---- #
# --- The main data fetching function:                                          ---- #
# --- Fetch tmdb_data from FIRESTORE or TMDB API or Local File                  ---- #
# ---------------------------------------------------------------------------------- #
def load_movie_data(num_pages=1, db_client=None):
    tmdb_data, status_message = load_TMDB_data(db_client)                # Fetch from FIRESTORE or Local File
    if tmdb_data:
        return tmdb_data, status_message
    if FLASK_ENV == "USE_LOCAL_FILE_FOR_TESTING":               # testing - Fetch from Local File
        return None, status_message

    print(f"No valid cache found. Fetching from TMDB API...")   # Fetch from external API (TMDB)
    all_results = []
    for page in range(1, num_pages + 1):    # Fetch movie data (1 page = 20 movies)
        url = f"{BASE_URL}{page}"
        try:
            response = requests.get(url, headers=headers) 
            response.raise_for_status()     # Raises an HTTPError for bad responses (4xx or 5xx)
            data = response.json()
            all_results.extend(data.get('results', []))
            print(f"Fetched page {page}. Total movies: {len(all_results)}")
            if len(all_results) >= 120:     # Stop after 120 movies (6 pages)
                break
        except requests.exceptions.RequestException as e:
            error_msg = f"Failed to fetch data from TMDB: {e}"
            print(error_msg)                # Return empty data and the error message
            return {"results": [], "error": error_msg}, error_msg 

    if all_results:                         # Save tmdb_data, return fetched data and status
        data_to_store = {"results": all_results}
        write_status = save_TMDB_data_to_cloud(data_to_store, db_client) 
        return data_to_store, f"TMDB API Fetched and cache written: {write_status}"
    else:
        print(f"Error: Save to Cloud Cache [all_results] is empty")   
                      
    return {"results": []}, status_message  # Return empty data and the last status message

Interactive Demo: Fetch Top Movies

Click the RUN button above to execute the server-side API fetch logic and see the data.

Key Competencies Demonstrated:

**Request Handling:** Secure, server-side interaction with external APIs.
**Cloud Caching:** Utilizing Firestore for persistence, reducing latency and reliance on the external API.
**Web Backend/Flask:** Implementing a custom RESTful API endpoint.

Pandas & Data Analysis

Test Project: Data Cleaning, Transformation, and Aggregation

This section showcases the use of the **Pandas** library for data science essentials: loading raw data from the cloud cache, performing complex **data transformation**, and generating **meaningful analytical insights**. The data for processing is downloaded in the REST API section.

The analysis applies a well-known industry standard, the **Weighted Rating Formula (similar to IMDb's)**, to the raw TMDB scores. This demonstrates the ability to implement statistical models in Python to derive a more accurate and robust metric than a simple average score.
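As a worked illustration, the weighted score combines a movie's own average with the dataset mean, weighted by how many votes the movie has. The sketch below uses the formula `(v / (v + m)) * R + (m / (v + m)) * C`; the function name `weighted_score`, its signature, and the sample numbers are assumptions for illustration, not the portfolio's `api_utils` implementation.

```python
def weighted_score(R: float, v: float, m: float, C: float) -> float:
    """Weighted score: (v / (v + m)) * R + (m / (v + m)) * C.

    R: the movie's raw average score
    v: the movie's vote count
    m: minimum vote count required to qualify
    C: mean score across the whole dataset
    """
    if v + m == 0:
        return C  # no votes and no threshold: fall back to the dataset mean
    return (v / (v + m)) * R + (m / (v + m)) * C


# Illustrative numbers only (not real TMDB data)
few_votes = weighted_score(R=9.0, v=100, m=1000, C=7.0)
many_votes = weighted_score(R=9.0, v=100_000, m=1000, C=7.0)
```

With few votes the result stays close to the dataset mean `C`; with many votes it approaches the movie's own score `R`, which is why a highly rated but rarely voted title does not outrank a well-established one.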

Key Analytical Tasks:

**Data Loading:** Ingesting complex JSON data (from Firestore via Flask) directly into a Pandas DataFrame.
**Transformation:** Cleaning date fields, calculating the 75th percentile for vote counts, and applying the Weighted Score formula.
**Aggregation:** Grouping data by year and calculating mean scores to find trends.
**Filtering & Sorting:** Identifying the top-rated movies based on the calculated Weighted Score.
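The four tasks above can be sketched end to end on a tiny made-up dataset. The records, the 0.75 quantile, and the variable names are assumptions for illustration; the portfolio's own helper functions are not reproduced here.

```python
import pandas as pd

# Made-up records shaped like the TMDB fields used in this section
records = [
    {"title": "A", "vote_average": 8.5, "vote_count": 12000, "release_date": "2020-05-01"},
    {"title": "B", "vote_average": 9.1, "vote_count": 150, "release_date": "2020-09-12"},
    {"title": "C", "vote_average": 7.9, "vote_count": 8000, "release_date": "2021-03-03"},
    {"title": "D", "vote_average": 8.2, "vote_count": 20000, "release_date": "2021-11-20"},
]

# Data loading: a list of dicts maps directly onto DataFrame rows
df = pd.DataFrame(records)

# Transformation: parse dates and derive the release year
df["release_year"] = pd.to_datetime(df["release_date"], errors="coerce").dt.year

# C = mean score, m = 75th percentile of vote counts
C = df["vote_average"].mean()
m = df["vote_count"].quantile(0.75)

# Weighted score, vectorized over the whole frame
v, R = df["vote_count"], df["vote_average"]
df["weighted_score"] = (v / (v + m)) * R + (m / (v + m)) * C

# Filtering & sorting: keep qualified movies, best first
qualified = df[df["vote_count"] >= m].sort_values("weighted_score", ascending=False)

# Aggregation: mean score per release year
yearly_avg = df.groupby("release_year")["vote_average"].mean()
```

Computing the weighted score as a column expression avoids a row-by-row `apply`, which tends to matter more as the dataset grows.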

Interactive Demo: Run Pandas Analysis

Click the RUN button above to process the TMDB data using Pandas on the backend.

Server-Side Code Snapshot (Python / Pandas Implementation):

######################################################################################
###                              PANDAS endpoint                                   ###
###   Retrieves TMDB data, processes it using Pandas, and returns summary stats.   ###
###      Demonstrates data loading, cleaning, transformation, and aggregation.     ###
###################################################################################### 
@app.route('/api/get_pandas_data', methods=['GET'])
def get_pandas_analysis():
    raw_data, status_message = api_utils.load_movie_data(6, db_firestore_client)  # Load the raw data (Cache-or-Fetch logic)
    df, C, m = api_utils.process_data_for_analysis(raw_data) # cleans, calculates C, m and adds the weighted_score column
    if df.empty:
        return jsonify({"error": "Failed to load movie data for Pandas analysis.", "source_status": status_message}), 500
    qualified_movies = df[df['vote_count'] >= m].copy() 
    qualified_movies['weighted_score'] = qualified_movies.apply(lambda row: api_utils.weighted_rating(row, m, C), axis=1)
    
    # Top 5 Best Rated Qualified Movies (by Weighted Score)
    top_movies_weighted = qualified_movies.sort_values(by='weighted_score', ascending=False).head(5)
    
    # Simple Summary Statistics
    stats = {
        'mean_score': f"{df['vote_average'].mean():.2f}",
        'median_votes': f"{df['vote_count'].median():,.0f}",
        'total_movies_analyzed': len(df),
        'min_votes_for_qualification': f"{m:,.0f}"
    }

    # Aggregation by Year (average score for the 3 most recent years)
    yearly_avg = df.groupby('release_year')['vote_average'].mean().reset_index()
    yearly_avg = yearly_avg.sort_values(by=['release_year', 'vote_average'], ascending=[False, False]).head(3)

    # Convert DataFrames to JSON structures - Prepare Final JSON Response
    top_movies_json = top_movies_weighted[[
        'title', 'vote_average', 'vote_count', 'release_year', 'weighted_score'
    ]].to_dict(orient='records')
    
    yearly_json = yearly_avg.to_dict(orient='records')
    return jsonify({
        "status": "SUCCESS",
        "source_status": status_message,
        "summary_stats": stats,
        "top_movies_weighted": top_movies_json,
        "top_years": yearly_json
    })

NumPy Data Preprocessing

Test Project: NumPy Data Preprocessing

This section showcases the use of the **NumPy** library for fast data manipulation, applying two common preprocessing techniques to movie scores: Z-Score Standardization and Min-Max Normalization. The data for processing is downloaded in the REST API section and cleaned in the Pandas section.
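Both techniques reduce to a couple of array expressions. The sketch below runs them on made-up scores and adds guards for the degenerate case where all values are equal, which is an addition for illustration rather than part of the endpoint.

```python
import numpy as np

scores = np.array([6.0, 7.0, 8.0, 9.0])  # made-up scores, not TMDB data

# Z-score standardization: subtract the mean, divide by the population std
std = scores.std()
z_scores = (scores - scores.mean()) / std if std > 0 else np.zeros_like(scores)

# Min-max normalization: rescale linearly onto [0, 1]
span = scores.max() - scores.min()
min_max = (scores - scores.min()) / span if span > 0 else np.zeros_like(scores)
```

After standardization the values have mean 0 and standard deviation 1; after min-max scaling the smallest score maps to 0 and the largest to 1.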

Interactive Demo: Run NumPy Analysis (Standardization & Normalization)

Click the button to run NumPy Analysis on the TMDB data.

Server-Side Code Snapshot (Python / NumPy Implementation):

######################################################################################
###                                  NUMPY ENDPOINT                                ###
### Standardization (Z-score) and Normalization (Min-Max Scaling) on movie scores. ###
######################################################################################
@app.route('/api/run_numpy_analysis', methods=['GET'])
def run_numpy_analysis():
    raw_data, status_message = api_utils.load_movie_data(6, db_firestore_client)
    if not raw_data or not raw_data.get('results'):
        return jsonify({"error": f"Failed to load data for NumPy analysis. Status: {status_message}"}), 500
    try:
        # 1. Prepare Data
        df = pd.DataFrame(raw_data['results'])

        # Filter for numerical stability
        df = df[df['vote_average'].notna() & (df['vote_count'] > 0)].copy()
        
        # Extract the relevant series (Score) into a NumPy array
        scores = df['vote_average'].to_numpy()
        movie_titles = df['title'].tolist()

        if len(scores) < 5:
            return jsonify({"error": "Not enough data points for NumPy analysis."}), 500

        # 2. Standardization (Z-Score): z_scores = (scores - mean) / std_dev
        mean = np.mean(scores)
        std_dev = np.std(scores)
        z_scores = (scores - mean) / std_dev

        # 3. Normalization (Min-Max Scaling): (X - min) / (max - min)
        min_score = np.min(scores)
        max_score = np.max(scores)
        min_max_scores = (scores - min_score) / (max_score - min_score)
        
        # 4. Compile results for the top movies for display
        results = []
        for i in range(NUMPY_DISPLAY_LIMIT):
            if i >= len(scores):
                break
            results.append({
                "title": movie_titles[i],
                "raw": f"{scores[i]:.2f}",
                "z_score": f"{z_scores[i]:.4f}",
                "min_max": f"{min_max_scores[i]:.4f}"
            })
            
        analysis_stats = {
            "mean": f"{mean:.2f}",
            "std": f"{std_dev:.2f}",
            "min": f"{min_score:.2f}",
            "max": f"{max_score:.2f}"
        }

        return jsonify({
            "results": results,
            "stats": analysis_stats
        })
    except Exception as e:
        print(f"Numpy Data Generation Failed: {e}")
        return jsonify({"error": f"Internal Server Error during Numpy data generation: {str(e)}"}), 500

SCIKIT-LEARN & Predictive Modeling

Test Project: Regression Analysis and Model Evaluation

This section demonstrates essential machine learning workflow steps using scikit-learn, a leading library for classical ML algorithms. It uses data from TMDB (top 120 movies) to train a model that predicts a movie's weighted score. The processed data is downloaded in the REST API section and cleaned in the Pandas section.

Key ML Concepts Demonstrated:

**Feature Engineering:** Using features like `vote_count` and `popularity` to predict the target variable.
**Train/Test Split:** Dividing data for training and unbiased model evaluation.
**Linear Regression:** Training a simple model to establish linear relationships.
**Model Evaluation:** Calculating key metrics like the $R^2$ Score and Mean Squared Error (MSE).
**Cross-Validation:** Ensuring model robustness across different subsets of the data.
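The workflow listed above can be sketched with scikit-learn on synthetic data. The features, coefficients, and noise level below are invented stand-ins for the movie data, so the printed metrics say nothing about the real model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the movie data: two features and a noisy linear target
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(120, 2))  # e.g. scaled vote_count and popularity
y = 6.0 + 1.5 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.05, size=120)

# Train/test split: hold out 20% of the rows for an unbiased evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Linear regression: fit on the training rows only
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Model evaluation on the held-out rows
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)

# 5-fold cross-validation (the default score for regressors is R^2)
cv_scores = cross_val_score(LinearRegression(), X, y, cv=5)

if __name__ == "__main__":
    print(f"R^2: {r2:.3f}, MSE: {mse:.5f}, CV mean R^2: {cv_scores.mean():.3f}")
```

Comparing the single held-out score with the cross-validated scores is a quick check that the result does not depend on one particular split.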

Interactive Demo: Run Scikit-learn Regression

Click the button to train and evaluate the regression model on the TMDB data.

Server-Side Code Snapshot (Python / Scikit-learn Implementation):

######################################################################################
###                       SCIKIT-LEARN ANALYSIS ENDPOINT 2                         ###
###      Generates data for a 3D surface plot to visualize the Weighted Score      ###
###                   function based on Raw Score and Vote Count.                  ###
######################################################################################
@app.route('/api/get_sklearn_plot_data', methods=['GET'])
def get_sklearn_plot_data():
    raw_data, status_message = api_utils.load_movie_data(6, db_firestore_client)
    
    if not raw_data or not raw_data.get('results'):
        return jsonify({"error": f"Failed to load data for Plotly. Status: {status_message}"}), 500
    
    try:
        df = pd.DataFrame(raw_data['results'])
        
        # Calculate C and m from the current dataset (as in the analysis)
        df['vote_count'] = pd.to_numeric(df['vote_count'], errors='coerce')
        df['vote_average'] = pd.to_numeric(df['vote_average'], errors='coerce')
        df.dropna(subset=['vote_count', 'vote_average'], inplace=True)
        df = df[df['vote_count'] > 0]
        
        C = df['vote_average'].mean()
        m = df['vote_count'].quantile(api_utils.TMDB_QUANTILE) 
        
        # 1. Define the range for the two input axes (X and Y)
        # X-axis: Votes (v). Range from 0 to 5 times the min qualified votes (m).
        v_min = 0
        v_max = int(m * 5) + 1 
        votes = np.linspace(v_min, v_max, 50) # 50 points for smoothness
        
        # Y-axis: Raw Score (R). Range from 5.0 to 10.0 (TMDB range).
        R_min = 5.0
        R_max = 10.0
        raw_scores = np.linspace(R_min, R_max, 20) # 20 points
        
        # 2. Create the meshgrid for all combinations
        V, R = np.meshgrid(votes, raw_scores)
        
        # 3. Calculate the Z-axis (Weighted Score) for every point in the grid
        # The formula: (v / (v + m)) * R + (m / (v + m)) * C
        Z = (V / (V + m)) * R + (m / (V + m)) * C

        # 4. Prepare the data for Plotly (must be converted to lists)
        plot_data = {
            "x": V.tolist(),        # Votes (x-axis)
            "y": R.tolist(),        # Raw Score (y-axis)
            "z": Z.tolist(),        # Weighted Score (z-axis)
            "C_constant": f"{C:.3f}",
            "m_constant": f"{m:.0f}"
        }
        return jsonify(plot_data)
    except Exception as e:
        print(f"Plotly Data Generation Failed: {e}")
        return jsonify({"error": f"Internal Server Error during Plotly data generation: {str(e)}"}), 500
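
The meshgrid step in the endpoint above can be checked in isolation. The sketch below assumes made-up constants (`m=100`, `C=6.5`, `v_max=500`) and a hypothetical helper name. It shows the shapes that `np.meshgrid` produces and that the `v = 0` edge of the surface collapses to `C`. It requires `m > 0`, otherwise the `v = 0` column divides by zero.

```python
import numpy as np


def weighted_surface(m, C, v_max, n_votes=50, n_scores=20):
    """Build the (V, R, Z) grids for a weighted-score surface (hypothetical helper)."""
    votes = np.linspace(0, v_max, n_votes)
    raw_scores = np.linspace(5.0, 10.0, n_scores)
    # meshgrid repeats votes along rows and raw_scores along columns,
    # so every grid has shape (n_scores, n_votes)
    V, R = np.meshgrid(votes, raw_scores)
    Z = (V / (V + m)) * R + (m / (V + m)) * C
    return V, R, Z


V, R, Z = weighted_surface(m=100.0, C=6.5, v_max=500)
print(V.shape, R.shape, Z.shape)   # (20, 50) (20, 50) (20, 50)

# With zero votes the raw score has no weight, so the whole first column equals C
print(np.allclose(Z[:, 0], 6.5))   # True

# .tolist() turns each grid into nested Python lists, which JSON can serialize
nested = Z.tolist()
print(len(nested), len(nested[0]))  # 20 50
```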

# ---------------------------------------------------------------------------------- #      
# ---                        PANDAS & SCIKIT-LEARN                               --- #
# ---            Formula: W = (v / (v + m)) * R + (m / (v + m)) * C              --- #
# ---------------------------------------------------------------------------------- #
def weighted_rating(row, m, C):                   # m = minimum votes required (75th percentile of vote_count)
    v = row['vote_count']                         # v = number of votes for the movie (vote_count)
    R = row['vote_average']                       # R = average for the movie (vote_average)
    return (v / (v + m)) * R + (m / (v + m)) * C  # C = mean vote across the whole report (mean of vote_average)
    
# ---------------------------------------------------------------------------------- #      
# ---                     PANDAS/SKLEARN UTILITY FUNCTION                       ---- #
# --- Processes the raw TMDB movie data (list of dicts) into a Pandas DataFrame, --- #
# --- cleans it, calculates the weighted score formula constants (C and m),      --- #
# --- and adds the final weighted_score column.                                  --- #
# --- Returns tuple: (pd.DataFrame, float C, int m)                              --- #
# ---------------------------------------------------------------------------------- #
def process_data_for_analysis(raw_data):
    if not raw_data or not raw_data.get('results'):
        return pd.DataFrame(), 0.0, 0
    df = pd.DataFrame(raw_data['results'])                              # Convert list of dicts to DataFrame  
    
    # Convert necessary columns to appropriate types and handle NaNs    # Cleaning and Standardization
    df = df.dropna(subset=['vote_average', 'vote_count', 'popularity', 'release_date'])
    df['vote_count'] = df['vote_count'].astype(int)                     # Convert types
    df['vote_average'] = df['vote_average'].astype(float)
    
    # Extract year
    df['release_year'] = pd.to_datetime(df['release_date'], errors='coerce').dt.year
    df.dropna(subset=['release_year'], inplace=True)
    df['release_year'] = df['release_year'].astype(int)

    # Use drop_duplicates with subset='id' to prevent the 'unhashable type: list' error    
    df.drop_duplicates(subset=['id'], keep='first', inplace=True)
    
    # Remove rows where key metrics are missing or zero (not a valid movie rating)         
    df = df[
        (df['vote_count'] > 0) & 
        (df['vote_average'] > 0) & 
        (df['popularity'].notna()) &
        (df['vote_average'].notna())
    ].copy()
    if len(df) == 0:                                                     # IMDB Weighted Rating Formula Implementation
        return df, 0.0, 0                                                # Formula: W = (v / (v + m)) * R + (m / (v + m)) * C
    C = df['vote_average'].mean()                                        # C = mean vote across the whole report (mean of vote_average)
    m = df['vote_count'].quantile(TMDB_QUANTILE)                         # m = minimum votes required (75th percentile of vote_count)
    m = max(TMDB_MIN_VOTES, int(m))   # Ensure m is an integer and at least TMDB_MIN_VOTES = 50     
    # Apply the function to create the new column                        # R = average for the movie (vote_average)
    df['weighted_score'] = df.apply(weighted_rating, axis=1, m=m, C=C)   # v = number of votes for the movie (vote_count)
    
    # Select final columns for clarity and relevance
    df = df[['title', 'vote_average', 'vote_count', 'popularity', 'release_date', 'release_year', 'weighted_score']].copy()
    return df, C, m
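
A short worked example may make the weighted rating formula more concrete. The constants below (`C = 6.0`, `m = 100`) are invented for illustration, and the function takes plain numbers rather than a DataFrame row.

```python
def weighted_rating(v, R, m, C):
    """W = (v / (v + m)) * R + (m / (v + m)) * C"""
    return (v / (v + m)) * R + (m / (v + m)) * C


C, m = 6.0, 100   # hypothetical dataset mean and vote threshold
R = 9.0           # raw average of one highly rated movie

no_votes = weighted_rating(0, R, m, C)            # weight is entirely on C
at_threshold = weighted_rating(100, R, m, C)      # v == m: halfway between R and C
many_votes = weighted_rating(1_000_000, R, m, C)  # weight is almost entirely on R

print(no_votes)       # 6.0
print(at_threshold)   # 7.5
print(round(many_votes, 3))
```

The formula is a weighted average of `R` and `C`: a movie with few votes is pulled toward the dataset mean, and its own average only dominates once `v` is large relative to `m`.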

TensorFlow Analysis

Test Project: MNIST Digit Recognition & Image Heuristics

This section showcases the **TensorFlow** library. Users can draw a digit and get a prediction from a TensorFlow model. The demo covers real-time image preprocessing, centering heuristics, and deep learning inference. MNIST is the 'Hello World' of computer vision: the model is trained on 60,000 labeled handwritten digits and classifies your drawing based on what it learned. The architecture is hybrid: development uses a local, K8s-optimized TensorFlow model, while the serverless production deployment falls back to a cloud vision API (Google Gemini) to optimize latency and cost.

Interactive Demo: Draw a single digit (0-9) in the box below. A neural network will process the 28x28 grayscale image and attempt to identify it.

Draw Here

Reference Data

Training Set: 60,000 Samples

Heuristic Checks:

Centering: Enabled
Noise Filtering: Active

AI Insights

Neural Input:

Classification Result:

No active prediction. Awaiting draw...

Server-Side Code Snapshot (Python / TensorFlow Implementation):

######################################################################################
###                            TensorFlow ENDPOINT                                 ###
###           Digit recognition (MNIST) - The "hello world" of ML/DL               ###
###         build+train a neural net to recognize hand-written digits              ###
######################################################################################  
@app.route("/predict", methods=["POST"])
def predict():
    if "image" not in request.files:
        return jsonify({"result": "Program error - No image uploaded"}), 400
    try:
        file = request.files["image"]
        arr_2d = api_utils.preprocess_image(file)           # Preprocess to 2D array (28, 28) convert to grayscale, resize, convert to numpy array, normalize ...
        img_b64 = api_utils.get_image_b64(arr_2d)           # Generate Base64 from 2D array for Gemini/Response
        digit = None
        confidence = 0.0
        margin = 0.0
        engine = "TensorFlow (Local)"
        if np.sum(arr_2d > 0.2) < 10:                           # Almost nothing drawn
            digit = "EMPTY"
            result = "Drawing/Canvas is empty or too small."
            confidence = 0.0
        else:
            if api_utils.HAS_TF:                                # Using Local TensorFlow Model...
                model = api_utils.get_or_train_model1()
                input_arr = arr_2d.reshape(1, 28, 28)           # Reshape ONLY for the prediction
                probs = model.predict(input_arr, verbose=0)[0]  # predict Using Local TensorFlow Model
                
                digit = int(np.argmax(probs))
                confidence = float(np.max(probs))
                sorted_probs = np.sort(probs)                   # Additional TF Heuristics
                margin = float(sorted_probs[-1] - sorted_probs[-2])
            else:                                               # TensorFlow not found. Use Google Gemini API...
                gemini_res = api_utils.call_gemini_vision(img_b64)
                digit = gemini_res.get("digit")
                confidence = float(gemini_res.get("confidence", 0.0))
                margin = 0.5 
                engine = "Gemini (Cloud API)"
            # Heuristics for "No Digit"
            if str(digit) == "gemini_ERR":
                result = "Google Gemini API AI Service temporarily unavailable."
            elif str(digit) == "gemini_API":
                result = "Google Gemini API key not valid."
            elif confidence < 0.80 or margin < 0.4:             # Model is guessing/unsure
                digit = "UNSURE"
                source = "TF Model" if api_utils.HAS_TF else "Gemini API"
                result = f"Input is unclear. {source} is guessing/unsure."
            else:
                result = f"{digit} (confidence={confidence:.2f})"
        return jsonify({"result": result,
                        "digit": str(digit),
                        "confidence": str(round(confidence, 4)), 
                        "mnist_img": img_b64,
                        "engine": engine}) 
    except Exception as e:
        print(f"Prediction error: {e}")
        return jsonify({"result": f"Server Error: {str(e)}"}), 500
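
The "unsure" heuristic in the endpoint above can be isolated into a small pure-Python function. This is a sketch with a hypothetical name (`decide`). It reuses the 0.80 confidence and 0.4 margin thresholds from the endpoint and takes a plain list of probabilities instead of a model output.

```python
def decide(probs, min_confidence=0.80, min_margin=0.40):
    """Return (label, confidence, margin); label is 'UNSURE' when the model is guessing."""
    # Rank class indices from most to least probable
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    confidence = probs[best]
    margin = confidence - probs[runner_up]   # gap between the top two classes
    if confidence < min_confidence or margin < min_margin:
        return "UNSURE", confidence, margin
    return best, confidence, margin


clear = [0.01, 0.01, 0.02, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]
ambiguous = [0.0, 0.45, 0.0, 0.0, 0.0, 0.0, 0.0, 0.55, 0.0, 0.0]

print(decide(clear)[0])       # 3
print(decide(ambiguous)[0])   # UNSURE
```

For probabilities that sum to 1, a top value of at least 0.80 leaves at most 0.20 for the runner-up, so the margin is then at least 0.60. With these defaults the confidence test is the binding one, and the margin test only starts to matter if the confidence threshold is lowered.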

# ---------------------------------------------------------------------------------- #       
# ---                            TensorFlow with Keras                           --- # 
# ---                            train a model on MNIST                          --- #
# ---                 This will only train on first app start                    --- #
# ---------------------------------------------------------------------------------- #
def get_or_train_model1():
    global mnist_model1
    if mnist_model1 is not None:
        return mnist_model1
    basedir = os.path.abspath(os.path.dirname(__file__))
    model_path = os.path.join(basedir, "mnist_model1.h5")
    try:
        mnist_model1 = tf.keras.models.load_model(model_path)         # Load the model if it exists
        print(f"MNIST {model_path} loaded from disk.")
    except Exception as e:
        print(f"Training a new MNIST model... Reason: {e}")
        (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
        # x_train, x_test = x_train / 255.0, x_test / 255.0           # Normalize pixel values to be between 0 and 1
        x_train = x_train / 255.0                                     # Normalize pixel values to be between 0 and 1
        mnist_model1 = models.Sequential([                            # Build a simple Sequential model
            layers.Flatten(input_shape=(28, 28)),
            layers.Dense(128, activation='relu'),
            layers.Dense(10, activation='softmax')
            # layers.Dropout(0.2)                                     # Add dropout for better generalization
        ])
        mnist_model1.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
        mnist_model1.fit(x_train, y_train, epochs=5, batch_size=32, verbose=2)    # Train for 5 epochs
        # mnist_model1.evaluate(x_train, y_train, verbose=0)          # initialize metrics (optional, not needed for most)
        mnist_model1.save(model_path)                                 # Save for future use     
    return mnist_model1

# ---------------------------------------------------------------------------------- #       
# ---                            TensorFlow with Keras                           --- # 
# ---                             preprocess_image 2D                            --- #
# ---             Returns a 2D numpy array (28, 28) normalized to 0-1.           --- #
# ---------------------------------------------------------------------------------- #
def preprocess_image(image_file):
    img = Image.open(image_file).convert("L").resize((28, 28)) # Open and convert to grayscale
    # Invert if necessary (MNIST is white on black) If users draw white on black in HTML, we stay as is.
    inverted_img = img                                         # Assuming white on black
    bbox = inverted_img.getbbox()                              # Find bounding box of the drawing to center it
    if bbox:                                                   # This helps a lot with "squiggles" or off-center drawings
        img = inverted_img.crop(bbox)                          # Crop to the digit and add padding to make it 20x20 (MNIST style)
        img.thumbnail((20, 20), Image.Resampling.LANCZOS)      # Resize the actual drawing to 20x20
        new_img = Image.new("L", (28, 28), 0)                  # Create a new black 28x28 canvas 
        offset = ((28 - img.width) // 2, (28 - img.height) // 2)
        new_img.paste(img, offset)                             # paste the 20x20 digit in the center
        img = new_img
    else:
        img = img.resize((28, 28), Image.Resampling.LANCZOS)   # Empty canvas
    arr = np.array(img)                                        # Convert to array 
    if np.mean(arr) > 127: 
        arr = 255 - arr 
    return arr / 255.0                                         # normalize
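
The centering idea in `preprocess_image` can also be illustrated with NumPy slicing alone. Note that this is a swapped-in technique, not the Pillow code above: it skips the resize to 20x20 and only shows the bounding-box crop and the centered paste. The function name `center_in_canvas` is hypothetical, and the input is assumed to fit inside the canvas.

```python
import numpy as np


def center_in_canvas(arr, canvas=28):
    """Crop the non-zero region of a 2D array and paste it into the middle of a blank canvas."""
    rows = np.any(arr > 0, axis=1)   # which rows contain ink
    cols = np.any(arr > 0, axis=0)   # which columns contain ink
    out = np.zeros((canvas, canvas), dtype=arr.dtype)
    if not rows.any():               # empty drawing: nothing to center
        return out
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    crop = arr[r0:r1 + 1, c0:c1 + 1]
    top = (canvas - crop.shape[0]) // 2
    left = (canvas - crop.shape[1]) // 2
    out[top:top + crop.shape[0], left:left + crop.shape[1]] = crop
    return out


# A 4x6 blob drawn in the top-left corner ends up in the middle of the canvas
drawing = np.zeros((28, 28))
drawing[0:4, 0:6] = 1.0
centered = center_in_canvas(drawing)
print(np.argwhere(centered > 0).min(axis=0))   # [12 11]
```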

# ---------------------------------------------------------------------------------- #       
# ---                            TensorFlow with Keras                           --- # 
# ---              preprocess_image base64 PNG for Google Gemini API             --- #
# ---                 Converts a 2D numpy array (0-1) to base64 PNG.             --- #
# ---------------------------------------------------------------------------------- #
def get_image_b64(arr):
    buf = io.BytesIO()
    # Ensure to use the 2D array here
    pil_img = Image.fromarray((arr * 255).astype("uint8"))
    pil_img.save(buf, format="PNG")
    return base64.b64encode(buf.getvalue()).decode('utf-8')

# ---------------------------------------------------------------------------------- #       
# ---                     Fallback prediction using Gemini API                   --- # 
# ---              If TensorFlow not found. Use Google Gemini API                --- #
# ---------------------------------------------------------------------------------- # 
def call_gemini_vision(img_b64):
    prompt = "Identify the handwritten digit (0-9). Return ONLY a JSON: {\"digit\": number, \"confidence\": float}"
    payload = {
        "contents": [{"parts": [{"text": prompt}, {"inlineData": {"mimeType": "image/png", "data": img_b64}}]}],
        "generationConfig": {"responseMimeType": "application/json"}
    }
    for i in range(2):
        try:
            res = requests.post(GEMINI_URL, json=payload, timeout=10)
            if res.status_code == 200:
                return json.loads(res.json()['candidates'][0]['content']['parts'][0]['text'])
            error_info = res.json().get('error', {})
            error_msg = error_info.get('message', 'Unknown API Error')
            error_code = error_info.get('code', res.status_code)
            print(f"Gemini API Error ({error_code}): {error_msg}")
            if res.status_code != 429:
                return {"digit": "gemini_API", "confidence": 0.0}  # gemini_API
            sleep(2**i)
        except Exception:                                  # network/parse errors: retry, but let KeyboardInterrupt/SystemExit through
            sleep(2**i)
    return {"digit": "gemini_ERR", "confidence": 0.0}
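
The retry loop in `call_gemini_vision` follows an exponential backoff pattern, which can be sketched generically. The helper below is hypothetical and differs from the code above in two ways: it retries on any exception rather than on HTTP 429 specifically, and it does not sleep after the final attempt. The `sleep` parameter is injectable so the delays can be observed without waiting.

```python
import time


def retry_with_backoff(call, attempts=2, base_delay=1.0, sleep=time.sleep):
    """Call `call()` up to `attempts` times, doubling the wait after each failure."""
    last_error = None
    for i in range(attempts):
        try:
            return call()
        except Exception as exc:
            last_error = exc
            if i < attempts - 1:             # no point waiting after the last try
                sleep(base_delay * 2 ** i)   # 1s, 2s, 4s, ...
    raise last_error


# Demo: a function that fails twice, then succeeds
calls = {"n": 0}
delays = []


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = retry_with_backoff(flaky, attempts=3, sleep=delays.append)
print(result, delays)   # ok [1.0, 2.0]
```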

C / C++ Development

Test Project: C / Python Integration (The Bridge)

Leveraging the speed of C within the flexibility of Python via Python C Extensions and ctypes.

fast_math.c (Compiled into a Shared Object fast_math.so Python extension)

/**********************************************************************************************
* METHOD A: Python C-API logic, This function handles Python Objects directly.               *
* Python and C live in two different worlds. Python is a "High-Level" language               *
* (everything is a complex object with metadata), while C is a "Low-Level" language          *
* (everything is raw bytes and memory addresses). This code is the "Translation Layer"       *
* (Boilerplate) required for them to talk to each other.                                     *
* COMPILE: gcc -shared -o fast_math.so -fPIC $(python3-config --includes) fast_math.c        *
**********************************************************************************************/
#define PY_SSIZE_T_CLEAN                                           // use Py_ssize_t (64-bit integers) for all lengths
#include <Python.h>                                                // Python C-API header (must come after PY_SSIZE_T_CLEAN)

// 1. The Calculation Function
static PyObject* method_fast_sum(PyObject* self, PyObject* args) { // PyObject* self: Points to the module itself
    PyObject* list_obj;                                            // PyObject* args: This is a Python Tuple where arguments are all packed 
    if (!PyArg_ParseTuple(args, "O", &list_obj)) {                 // PyArg_ParseTuple: unpack arguments
        return NULL;                                               
    }
    if (!PyList_Check(list_obj)) {                                 // PyList_Check: this piece of memory is a Python List
        PyErr_SetString(PyExc_TypeError, "Parameter must be a list.");
        return NULL;
    }
    long sum = 0;
    Py_ssize_t n = PyList_Size(list_obj);                          // PyList_Size: get the count
    for (Py_ssize_t i = 0; i < n; i++) {
        PyObject* item = PyList_GetItem(list_obj, i);              // PyList_GetItem: the item at index i (This is still a PyObject*)
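        long value = PyLong_AsLong(item);                          // PyLong_AsLong: converts the object into a raw C long (64-bit on Linux x86-64)
        if (value == -1 && PyErr_Occurred()) {                     // -1 plus a pending exception means a non-integer item or overflow
            return NULL;                                           // propagate the Python exception instead of summing garbage
        }
        sum += value;                                              // now a plain number that C can actually add to a variable
    }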
    return PyLong_FromLong(sum);                                   // PyLong_FromLong: converts the number into a Python Object container
}

// 2. The Method Table. This table maps the Python name to the C function address.
static PyMethodDef FastMathMethods[] = {                           // Python_Name, C_fce_Pointer, Type_of_Args, Docstring
    {"fast_sum", method_fast_sum, METH_VARARGS, "Calculate sum of a list quickly."},  // METH_VARARGS - tuple of arguments
    {NULL, NULL, 0, NULL}                                          // The "Sentinel" (VERY IMPORTANT) C arrays don't know their own length.
};

// 3. The Module Definition
static struct PyModuleDef fastmathmodule = {
    PyModuleDef_HEAD_INIT,                        // Standard internal header
    "fast_math",                                  // The name used in: import fast_math
    "C optimized math functions for Python",      // The module's description
    -1,                                           // Global state flag (-1 = simple/stateless)
    FastMathMethods                               // Points back to the Method Table
};

// 4. The Entry Point (PyInit_fast_math) This is what Python looks for during 'import fast_math'
PyMODINIT_FUNC PyInit_fast_math(void) {           // The Naming Rule - it is fixed
    return PyModule_Create(&fastmathmodule);      // Entry Point calls PyModule_Create()
}
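
For contrast with the C-API module above, the ctypes route mentioned in the introduction needs no wrapper code on the C side: Python loads a shared library and declares the argument and return types itself. The sketch below is an illustration only. It calls `strlen` from the system C library rather than anything from `fast_math`, and it assumes a POSIX system such as Linux.

```python
import ctypes
import ctypes.util

# Locate and load the C standard library. On POSIX, CDLL(None) also works
# and exposes the symbols already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# ctypes cannot inspect C signatures, so the types are declared by hand
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"fast_math")
print(length)   # 9
```

The trade-off is roughly this: ctypes avoids the boilerplate of a method table and module definition, but every call converts arguments at runtime, and a wrong type declaration is not caught until it misbehaves.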

Server-Side Python Code Snapshot: c_bridge.py & AJAX endpoint

######################################################################################
# Method A (Python C-API) creates a module that imports like a normal Python library #
#                      Loading the compiled fast_math.so library                     #
#                           calling the C function directly                          #
######################################################################################  
def c_bridge(count):
  hw_header = get_hardware_info()
  benchmark_output = "-" * 65 + "\n---    Performance Benchmark: C Extension vs. Pure Python     ---\n" + "-" * 65
  try:
      # import compiled module like a normal library
      import fast_math
      benchmark_output += "\nSUCCESS: fast_math module imported correctly!"
      
      # Create a large dataset (10 Million items)  count = 10_000_000 
      data = list(range(count))
      benchmark_output += f"\nDataset size: {count:,} integers\n"
      
      # --- TEST 1: C-EXTENSION ---
      start_c = time.perf_counter()
      result_c = fast_math.fast_sum(data) # fast_math - C module
      end_c = time.perf_counter()         # fast_sum  - function in C module
      time_c = end_c - start_c
      
      # --- TEST 2: PURE PYTHON LOOP ---
      start_py = time.perf_counter()
      sum_py = 0
      for n in data:
          sum_py += n
      end_py = time.perf_counter()
      time_py = end_py - start_py
      
      # --- TEST 3: PYTHON BUILT-IN SUM (Which is also written in C) ---
      start_builtin = time.perf_counter()
      sum_builtin = sum(data)
      end_builtin = time.perf_counter()
      time_builtin = end_builtin - start_builtin
      
      # Display Results
      benchmark_output += f"\n{'Method':<25} | {'Result':<20} | {'Execution Time':<15}\n"
      benchmark_output += "-" * 65
      benchmark_output += f"\n{'C-Extension (Custom)':<25} | {result_c:<20} | {time_c:.6f} s"
      benchmark_output += f"\n{'Python Manual Loop':<25} | {sum_py:<20} | {time_py:.6f} s"
      benchmark_output += f"\n{'Python Built-in sum()':<25} | {sum_builtin:<20} | {time_builtin:.6f} s\n"
      benchmark_output += "-" * 65
      if time_c > 0:
          speedup = time_py / time_c
          benchmark_output += f"\nConclusion: C-Extension is ~ {speedup:.0f}x faster than a manual Python loop."
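  except ImportError as e:
      benchmark_output += f"\nERROR: Could not find fast_math. {e}"
      return f"{hw_header}\n{benchmark_output}", None, None   # no timings when the import fails
  # Return the report plus raw timings, matching the 3-value unpacking in the AJAX endpoint
  return f"{hw_header}\n{benchmark_output}", time_c, time_py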

######################################################################################
###       C / Python Integration and Performance Benchmark Test AJAX endpoint      ###
###   run the Performance Benchmark C / Python - tests the sum of 10 million int   ###
######################################################################################
from c_bridge import c_bridge
@app.route('/api/runCbridge', methods=['POST'])
def runCbridge():
    try:
        benchmark_output,time_c,time_py = c_bridge(10_000_000) # Create a large dataset (10 Million items)
        return jsonify({
            "status": "success",
            "result": benchmark_output,
            "time_c": time_c,
            "time_py": time_py
        })
    except Exception as e:
        print(f"Performance Benchmark test failed: {e}")
        return jsonify({"error": f"Internal Server Error: {str(e)}"}), 500
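
The timing approach used in the benchmark can be reproduced without the C extension. The sketch below compares only the manual loop with the built-in `sum()`, using `timeit.repeat` and taking the minimum of several runs to reduce noise. The dataset is much smaller than the 10 million items above so that it runs quickly, and no particular speedup is claimed because timings depend on the machine.

```python
import timeit


def manual_sum(values):
    """Pure-Python loop, the baseline in the benchmark."""
    total = 0
    for n in values:
        total += n
    return total


data = list(range(100_000))
runs = 5

# number=1 times a single call; repeat=runs gives several samples to take the best of
t_loop = min(timeit.repeat(lambda: manual_sum(data), number=1, repeat=runs))
t_builtin = min(timeit.repeat(lambda: sum(data), number=1, repeat=runs))

print(f"{'Python Manual Loop':<25} | {manual_sum(data):<20} | {t_loop:.6f} s")
print(f"{'Python Built-in sum()':<25} | {sum(data):<20} | {t_builtin:.6f} s")
```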

Performance Benchmark - Sum of 10,000,000 integers - TERMINAL OUTPUT:
Click the "RUN Performance Benchmark" button above to execute the server-side Python function and see the performance results here.

C / C++ Key Competencies:

Systems & Performance

High-speed processing and memory management.

File I/O: High-performance stream processing
Memory: Pointers, references, and manual management
Efficiency: Fast procedural algorithms and recursion
Smart Pointers: modern memory safety (unique/shared)

Modern OOP (C++17/20)

Class Hierarchies
Inheritance & Polymorphism
Dynamic Binding
Operator Overloading
Constructors and Destructors

Copy & Move Semantics
STL (Vectors, Maps, etc.)
Exception Handling
Range-based Loops

Performance-First Mindset: I leverage C++ for tasks where execution speed is critical. While I am proficient in modern Object-Oriented design and C++17/20 standards, I apply C-style efficiency for direct system-level processing and high-throughput data operations.

Docker & Kubernetes

Test Project: Containerization & Live Deployment

This entire portfolio application is containerized with Docker and deployed as a serverless microservice on Google Cloud Platform (GCP), demonstrating a full production-ready lifecycle.

Implementation Specs and Technical strategies used in this project:

Multi-stage Builds: Compiling C-extensions in a builder stage to keep the final runtime image under 450MB.
Layer Optimization: Strategic command ordering to maximize Docker build-cache efficiency and reduce build times.
Cloud Run: Deployed as an OCI-compliant container for high-availability, auto-scaling serverless execution.
.dockerignore Optimization: Explicitly excluding venv, .env, and serviceAccountKey.json to ensure image security and minimal size.
Requirements Pruning: Using tensorflow-cpu and removing dev-dependencies to shrink the production footprint.
Aggressive Pruning: Removing __pycache__, tests, and library documentation during the build process.

Docker Development

Daily commands for local and remote development.

Image Creation & Tagging (Versioning)
Data Persistence (Volumes & Bind Mounts)
Multi-Container Networking & DNS
Docker Compose for local Orchestration

Kubernetes (K8s)

Bootcamp-certified concepts ready for deployment when needed.

K8s Architecture (Control Plane/Nodes)
Scalable Deployments & ReplicaSets
K8s Services (Load Balancers/Ingress)
Self-healing & Automated Rollouts

Proficient in the full DevOps lifecycle: packaging Python/C++ environments into standard images and managing them via Kubernetes for automated scaling and resilience.

Dockerfile (Multi-Stage)

###################################################################################
#                        Multi-stage build for efficiency                         #
#                                STAGE 1: Builder                                 #
###################################################################################
FROM python:3.11-slim AS builder

# Install system dependencies needed for compiling C extensions
RUN apt-get update && apt-get install -y \
    gcc \
    python3-dev \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Copy only the files needed for the build first (optimization)
COPY requirements.txt .
COPY fast_math.c .

# Compile the C extension - python3-config supplies the include paths for this interpreter's headers
RUN gcc -shared -o fast_math.so -fPIC $(python3-config --includes) fast_math.c

# Install requirements into a local folder
RUN pip install --target=/app/pkgs --no-cache-dir -r requirements.txt && \
    rm -rf /app/pkgs/nvidia*

# AGGRESSIVE PRUNING (This reduces size significantly) remove: __pycache__, tests, documentation, and compiled object files
RUN find /app/pkgs -name "__pycache__" -type d -exec rm -rf {} + && \
    find /app/pkgs -name "*.pyc" -delete && \
    find /app/pkgs -name "*.pyo" -delete && \
    find /app/pkgs -name "*.dist-info" -type d -exec rm -rf {} + && \
    rm -rf /app/pkgs/tensorflow/include && \
    rm -rf /app/pkgs/numpy/tests && \
    rm -rf /app/pkgs/pandas/tests

###################################################################################
#                             STAGE 2: Final Runtime                              #
###################################################################################
FROM python:3.11-slim AS runtime
WORKDIR /app

# Copy only the compiled extension and the installed packages
COPY --from=builder /app/fast_math.so .
COPY --from=builder /app/pkgs /app/pkgs

# Copy the rest of the app (obeying .dockerignore)
COPY . .

# Set Environment Variables
ENV PYTHONPATH=/app/pkgs:.
ENV TF_ENABLE_ONEDNN_OPTS=0
ENV TF_CPP_MIN_LOG_LEVEL=2
ENV PORT=8080

CMD ["python3", "-m", "gunicorn", "--bind", "0.0.0.0:8080", "--workers", "2", "--timeout", "0", "api:app"]

GCP_Deployment_Script.sh

#!/bin/bash
######################################################################################
###                         GCP_Deployment_Script.sh                               ###
###################################################################################### 

# ---------------------------------------------------------------------------------- #
# ---                      Check if gcloud is installed                          --- #
# ---------------------------------------------------------------------------------- #
if ! command -v gcloud >/dev/null 2>&1; then
    echo "❌ ERROR: gcloud CLI not found. Please install it first."
    exit 1
fi

# ---------------------------------------------------------------------------------- #
# ---                    Set and Load variables from .env                        --- #
# ---------------------------------------------------------------------------------- #
REGION="europe-west1"
SERVICE_NAME="portfolio-app"
if [ -f .env ]; 
then
    export $(grep -v '^#' .env | xargs)
    echo "✅ Environment variables loaded from .env"
else
    echo "❌ ERROR: .env file not found."
    exit 2
fi

# ---------------------------------------------------------------------------------- #
# ---                            Run the deployment                              --- #
# ---------------------------------------------------------------------------------- #
echo "🚀 Deploying $SERVICE_NAME to Google Cloud Run ($REGION)..."
gcloud run deploy $SERVICE_NAME \
  --source . \
  --region $REGION \
  --allow-unauthenticated \
  --set-env-vars "API_KEY=$API_KEY,API_TOKEN=$API_TOKEN,FLASK_ENV=PRODUCTION,GEMINI_API_KEY=$GEMINI_API_KEY"

# Note: don't pass GOOGLE_APPLICATION_CREDENTIALS on GCP.
# The code in api_utils.py will see it is missing and use the Service Identity.

echo "✅ Deployment complete!"

# ---------------------------------------------------------------------------------- #
# ---                   Important steps before deployment                        --- #
# ---------------------------------------------------------------------------------- #
# 1. prepare requirements.txt
# pipreqs . --force                                          < -- generate fresh clean requirements.txt
# don't forget to add: gunicorn==23.0.0
# replace: tensorflow==2.20.0 with tensorflow_cpu==2.20.0    < -- local docker
# remove: tensorflow_cpu==2.20.0                             < -- Google Cloud Run
# remove all spaces and quotes

# 2. prepare Dockerfile

# 3. prepare .dockerignore 

# 4. build image
# docker build -t portfolio-app .
# docker images portfolio-app                                      < -- check size < 450MB
# docker run --rm --env-file .env portfolio-app env | grep API_KEY < -- check .env variables

# 5. test run Docker locally
# docker run --rm -it -p 8080:8080 \
#   --env-file .env \
#   -v $(pwd)/serviceAccountKey.json:/app/serviceAccountKey.json \
#   portfolio-app
# docker stop  or Ctrl+C

# 6. deployment
# Replace `europe-west1` with preferred region
# gcloud run deploy portfolio-app \
#   --source . \
#   --region europe-west1 \
#   --allow-unauthenticated \
#   --set-env-vars="API_KEY=abc, API_TOKEN=abc, ..."
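
The `.env` handling in the script (skip comments, export `KEY=value` pairs, then build the `--set-env-vars` string) can be mirrored in Python for testing. This is a hypothetical sketch with invented sample values. It handles only simple `KEY=value` lines and optional surrounding quotes, not the full range of shell syntax.

```python
def parse_env(text):
    """Parse simple KEY=value lines, ignoring blanks and '#' comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)   # split once so values may contain '='
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def to_set_env_vars(env, keys):
    """Join selected variables into the comma-separated KEY=value form."""
    return ",".join(f"{k}={env[k]}" for k in keys)


sample = "# local secrets\nAPI_KEY=abc\nAPI_TOKEN='xyz'\n\nGEMINI_API_KEY=\"g-123\"\n"
env = parse_env(sample)
print(to_set_env_vars(env, ["API_KEY", "API_TOKEN"]))   # API_KEY=abc,API_TOKEN=xyz
```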

DevOps Lifecycle: I focus on building optimized, OCI-compliant images that are portable across local Docker environments and production-grade Kubernetes clusters.

Linux System Administration

Proficient in managing Debian-based distributions (Ubuntu/Debian) as primary development and deployment environments.

Core Competencies, Real-world Application:
(Every component of this portfolio relies on Linux)

Docker containers built on python:slim (Debian)
Python venv isolation on Linux kernel
GCC compilation of C-extensions via terminal
Deploying via Google Cloud SDK (gcloud CLI)

System Foundations

User & Group Permissions (chmod/chown)
SSH Key Management & Secure Access
Package Management (APT/DPKG)
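As an illustration of the permissions item, the equivalent of `chmod 600` can be applied from Python with the standard library. This is a sketch on a temporary file; `restrict_to_owner` is a hypothetical helper name, and the result shown assumes a Linux filesystem.

```python
import os
import stat
import tempfile


def restrict_to_owner(path):
    """Set rw------- (0o600) on path and return the resulting permission bits."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)


# A throwaway file stands in for something sensitive, such as a private key.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    key_path = tmp.name

mode = restrict_to_owner(key_path)
print(oct(mode))  # 0o600 on Linux
os.remove(key_path)
```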

Shell & Automation

Bash Scripting for Task Automation
Environment Variable Configuration
Process Monitoring (top/htop/ps)
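Environment variable configuration usually comes down to reading required values at startup and failing loudly when one is missing. A minimal sketch follows, assuming a hypothetical `load_settings` helper; `API_KEY` and port 8080 are taken from the Docker commands earlier on this page.

```python
import os


def load_settings(required, defaults=None, environ=None):
    """Collect required settings from the environment, falling back to defaults."""
    environ = os.environ if environ is None else environ
    settings = dict(defaults or {})
    missing = []
    for name in required:
        value = environ.get(name, settings.get(name))
        if value is None or value == "":
            missing.append(name)
        else:
            settings[name] = value
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return settings


# A plain dict stands in for os.environ so the example is reproducible.
print(load_settings(["API_KEY", "PORT"],
                    defaults={"PORT": "8080"},
                    environ={"API_KEY": "abc"}))
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.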

Distro Focus: Primary experience with Ubuntu Server and Debian. Familiar with the Filesystem Hierarchy Standard (FHS) and standard diagnostic tools for networking and performance.

Git & Version Control

Key Competencies:

Understand how Git works behind the scenes
Git objects: trees, blobs, commits, and annotated tags
Master the essential Git workflow: adding & committing
Work with Git branches
Perform Git merges and resolve merge conflicts
Use Git diff to reveal changes over time
Master Git stashing
Undo changes using git restore, git revert, and git reset
Work with local and remote repositories
Master collaboration workflows: pull requests, "fork & clone", etc.
Squash, clean up, and rewrite history using interactive rebase
Retrieve "lost" work using git reflog
Write custom and powerful Git aliases
Mark releases and versions using Git tags
Host static websites using GitHub Pages
Create markdown READMEs
Share code and snippets using GitHub Gists
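To illustrate the "behind the scenes" items above: in a repository that uses the default SHA-1 object format, a blob's ID is the SHA-1 of a small header followed by the file content. The sketch below reproduces that calculation; `git_blob_id` is a name chosen for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob with this content (SHA-1 repos)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the empty blob)
```

The result for any file can be compared against `git hash-object <file>`. Because the ID depends only on the content, identical files are stored once.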

Core Workflow

Daily commands for local and remote development.

Basic Commands: init, config, status, log
Staging Area: Mastery of add, commit, and .gitignore
Branching: Creation, checkout/switch, and basic merging
Remote Work: Push, pull, fetch, and remote tracking
Documentation: Markdown READMEs & GitHub Pages

Advanced Capabilities

Bootcamp-certified concepts ready for deployment when needed.

History Control: Interactive rebase, squash, and reflog
Collaboration: Pull requests, "fork & clone", and conflict resolution
Undo/Recovery: Restore, revert, reset, and stashing
Internals: Understanding blobs, trees, and commit objects
Release Management: Semantic tagging and versioning
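For semantic tagging, tags must be compared numerically rather than alphabetically, otherwise `v1.9.0` sorts after `v1.10.0`. A minimal sketch follows, assuming tags of the form `vMAJOR.MINOR.PATCH`; pre-release suffixes are ignored here, and the helper names are hypothetical.

```python
import re

TAG_PATTERN = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag):
    """Return (major, minor, patch) as integers, or None for non-release tags."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def latest_release(tags):
    """Pick the tag with the highest semantic version."""
    versions = [(parse_tag(tag), tag) for tag in tags if parse_tag(tag) is not None]
    return max(versions)[1] if versions else None


print(latest_release(["v1.9.0", "v1.10.0", "v1.2.3", "nightly"]))  # v1.10.0
```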

Familiar with custom Git aliases and automation to optimize development speed.

Prometheus and Grafana Monitoring

System Observability Workflow

This section demonstrates a complete machine monitoring workflow. Using a Prometheus exporter, the application tracks custom metrics (like bot traffic) and exposes them via a secure endpoint. Grafana Cloud then scrapes this data to visualize real-time system health.

Key Monitoring Concepts Demonstrated:

Prometheus: Time-series data collection and metric exportation.
Grafana: Multi-platform analytics and interactive visualization.
Snapshot API: Programmatic generation of shareable dashboard instances.
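To make "metric exportation" concrete, the sketch below renders one counter in the plain-text format that a Prometheus scrape reads. It is a simplified stand-in written with the standard library only: `render_counter` is a hypothetical helper, label values are not escaped, and a real client library emits more than this. The metric name and label come from the server-side code on this page.

```python
def render_counter(name, help_text, samples):
    """Render a counter as text; samples maps a tuple of (label, value) pairs to a number."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_str = ",".join(f'{key}="{val}"' for key, val in labels)
        series = f"{name}{{{label_str}}}" if label_str else name
        lines.append(f"{series} {float(value)}")
    return "\n".join(lines) + "\n"


print(render_counter(
    "portfolio_bot_traffic_total",
    "Total requests from bots",
    {(("bot_type", "crawler"),): 0},
))
```

Each line after the two comment lines is one time series: the metric name, its labels, and the current value.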

Interactive Demo: Live Monitoring - Show Grafana Dashboard

Click the button above to generate a real-time snapshot of server metrics.

The snapshot opens in a new window to bypass browser security restrictions and shows these queries:

🤖 Bot Visits (Last Hour):
Query: increase(portfolio_bot_traffic_total[1h])

👥 Total User Visits:
Query: sum(flask_http_request_total{job!="portfolio-bot-stats", path!="/metrics"})

🟢 System Status:
Query: time() - process_start_time_seconds{scrape_job="portfolio-bot-stats"}

📊 Cumulative Traffic (Stacked):
Query: flask_http_request_total - portfolio_bot_traffic_total

⏱️ Container Uptime:
Query: time() - process_start_time_seconds

📉 Traffic Rate Comparison:
Query: rate(portfolio_bot_traffic_total[$__rate_interval])
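Several of the queries above rely on `increase()` and `rate()`, which work on counters that only go up and may restart from zero. The sketch below shows the core idea on a list of raw samples. It is a simplification: Prometheus additionally extrapolates to the edges of the time window, which is not modelled here, and `counter_increase` is a name chosen for this example.

```python
def counter_increase(samples):
    """Total increase across ordered counter samples, tolerating resets."""
    total = 0.0
    for previous, current in zip(samples, samples[1:]):
        if current >= previous:
            total += current - previous
        else:
            # The counter restarted from zero (e.g. a new container instance).
            total += current
    return total


print(counter_increase([5, 8, 12, 2, 6]))  # 13.0
```

Dividing such an increase by the window length in seconds gives the per-second figure that `rate()` reports.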

Server-Side Implementation (Python / Prometheus Client):

# ---------------------------------------------------------------------------------- #
# ---                Initialize the metrics, Prometheus + Grafana                --- #
# ---                   static_labels adds metadata to every metric              --- #
# ---------------------------------------------------------------------------------- #
import base64
import hmac

from flask import Response, request
from prometheus_client import CONTENT_TYPE_LATEST, Counter, generate_latest
from prometheus_flask_exporter import PrometheusMetrics

import api_utils                               # project module holding GRAFANA_USER / GRAFANA_PWD (import path assumed)

# `app` is the Flask application object created earlier in this module.
# path=None: the exporter registers no endpoint itself; /metrics is served by the protected route below
metrics = PrometheusMetrics(app, path=None, static_labels={'app': 'portfolio-backend'})

BOT_TRACKER = Counter('portfolio_bot_traffic_total', 'Total requests from bots', ['bot_type'])
BOT_TRACKER.labels(bot_type='crawler').inc(0)  # create the series up front so it is exported with value 0

def metrics_app(environ, start_response):      # WSGI wrapper that forwards a request to the Flask app
    environ['PATH_INFO'] = '/'                 # note: the forwarded request is routed as '/', not '/metrics'
    return app.wsgi_app(environ, start_response)

@app.route('/metrics')
def metrics_route():
    # Grab the header. Check both names (Firebase sometimes renames it)
    auth_header = request.headers.get('Authorization') or request.headers.get('X-Forwarded-Authorization')
    # DEBUG: log only whether a header arrived; the header itself contains the credentials
    print(f"DEBUG: Auth header present: {bool(auth_header)}")
    if auth_header and auth_header.startswith('Basic '):
        try:                                                       # Decode: "Basic string" -> "user:pwd"
            encoded_str = auth_header.split(' ')[1]
            decoded_str = base64.b64decode(encoded_str).decode('utf-8')
            username, password = decoded_str.split(':', 1)         # split once: passwords may contain ':'

            # Constant-time comparison avoids leaking information through response timing
            user_ok = hmac.compare_digest(username.encode(), api_utils.GRAFANA_USER.encode())
            pwd_ok = hmac.compare_digest(password.encode(), api_utils.GRAFANA_PWD.encode())
            if user_ok and pwd_ok:
                return Response(generate_latest(), mimetype=CONTENT_TYPE_LATEST)
            else:
                print(f"DEBUG: Credential mismatch for user: {username}")
        except Exception as e:
            print(f"DEBUG: Auth parsing failed: {e}")
    return Response(
        'Authentication Required', 401,
        {'WWW-Authenticate': 'Basic realm="Grafana Metrics"'}
    )
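The header handling in the route above can be exercised without Flask. The following is a standalone sketch of the same Basic-auth decoding, using only the standard library; the function names and the sample credentials are hypothetical and not part of the portfolio code.

```python
import base64
import hmac


def make_basic_header(username, password):
    """Build the Authorization header value a scraper would send."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def parse_basic_header(header):
    """Return (username, password), or None if the header is not valid Basic auth."""
    if not header or not header.startswith("Basic "):
        return None
    try:
        raw = base64.b64decode(header[len("Basic "):], validate=True)
        decoded = raw.decode("utf-8")
    except ValueError:  # covers invalid base64 and invalid UTF-8
        return None
    username, sep, password = decoded.partition(":")  # split on the first ':' only
    return (username, password) if sep else None


def credentials_match(header, expected_user, expected_pwd):
    """Compare parsed credentials with the expected ones in constant time."""
    parsed = parse_basic_header(header)
    if parsed is None:
        return False
    user_ok = hmac.compare_digest(parsed[0].encode(), expected_user.encode())
    pwd_ok = hmac.compare_digest(parsed[1].encode(), expected_pwd.encode())
    return user_ok and pwd_ok


header = make_basic_header("grafana", "s3cr:et")
print(parse_basic_header(header))                      # ('grafana', 's3cr:et')
print(credentials_match(header, "grafana", "wrong"))   # False
```

Splitting on the first colon only matters because a password is allowed to contain a colon, while a username is not.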

Bootstrap

This responsive website was built using **Bootstrap 5**, leveraging a modified version of the Start Bootstrap - Grayscale theme, HTML5, CSS3, JavaScript, ... The process demonstrates hands-on experience with a modern front-end workflow and deep customization of a major framework.

Immediate Skills Demonstrated:

**Bootstrap 5:** Component usage, utility classes, and custom SASS/CSS overrides.
**Responsive Design:** Customizing layout (e.g. fixed columns) across various breakpoints.
**HTML5 & CSS3:** Core structural and styling modifications.
**Development Environment:** Linux (Kubuntu), VS Code, Wayland.
**Version Control (Git):** Advanced usage of Git for branching, merging, and collaboration workflows.

About Me

I am a Full-Stack Developer with a primary focus on backend engineering. I enjoy everything related to computers and programming, especially the whole process of turning ideas into working code and building something from nothing. I like learning new technologies, programming languages, and frameworks, and figuring out how things work. My passion for computers started a long time ago and was mainly connected with the automotive industry (UNIX/Linux, C/C++, KSH, CATIA, VB, ...). A few years ago I decided to expand my knowledge and experience with modern technologies and newer, high-demand skills. Some of those skills and technologies are demonstrated on this test webpage.

I work exclusively in a native Linux environment, which keeps my development workflow close to production cloud environments. This "production-first" mindset helps ensure that the code I write is robust, container-ready, and optimized for modern infrastructure.

"Implemented a hybrid architecture that uses local K8s-optimized TensorFlow models in development and fails over to Cloud Vision APIs in serverless production to optimize latency and cost."

Quick Facts

Development Environment
Native Linux (Debian/Ubuntu/Kubuntu)
Bootcamp Graduate
Intensive Full-Stack & DevOps: Python, Docker, Kubernetes

Professional Background
Transitioning with a focus on reliability and systematic problem-solving.

Current Focus
Cloud-Native applications, Backend Architecture & Performance Tuning

Learning Roadmap & Future Integration

I am committed to continuous technical growth. My current focus is expanding this portfolio into a distributed system to demonstrate advanced backend and cloud-native patterns:

Advanced Backend & Architecture

Microservices: Decoupling logic into independent, scalable services.

Elasticsearch: Implementing high-performance full-text search.

Asynchronous Tasks: Using Redis/Celery for background processing.

Cloud Ops & Observability

Terraform (IaC): Automating GCP infrastructure deployment.

Monitoring: Visualizing metrics with Grafana & Prometheus.

CI/CD: Advanced GitHub Actions for automated testing/delivery.

Get In Touch

Radimir Dedecek

Senior Software Developer | Backend Specialist

I am currently open to discussing new projects, technical collaborations, and professional opportunities. Feel free to connect via the platforms below.

Location

Nova Viska 66
46331 Nova Ves, CZ

Email

radimir.dedecek@gmail.com

Phone

+420 608 260 026

GitHub

portfolio-repo

LinkedIn

/in/radimirdedecek
