written by Eric J. Ma on 2025-02-17 | tags: packaging uv tools marimo juv environments
In this blog post, I explore how modern Python tooling is flipping the script on the age-old "which Python should I use?" question. Through my experience with uvx, marimo, and juv, I show how we're moving away from the traditional headache of environment setup and toward a world where tools automatically ship you the exact Python you need. No more environment setup puzzles – just specify your Python version and get straight to work. It's a liberating shift that's changing how I approach one-off Python work, and I think it's pretty exciting!
written by Eric J. Ma on 2025-02-07 | tags: refactoring llamabot optimization docker python cli packages performance engineering
In this blog post, I share my journey of tackling dependency bloat in LlamaBot. What began as a simple LLM bot framework had grown into a monolithic system with an extensive dependency chain, leading to massive installation sizes. By mapping dependencies, refactoring the code, and organizing optional dependencies, I managed to reduce the container size significantly. This exercise taught me the importance of regular codebase maintenance and focusing on core functionalities. Now, LlamaBot is leaner and more efficient. Curious about the strategies I used to achieve this transformation?
Read on... (1510 words, approximately 8 minutes reading time)
written by Eric J. Ma on 2025-01-31 | tags: large language models python llamabot pydantic structuredbot agentbot talks meetups
In this blog post, I explore the concept of 'what makes an agent' by discussing various implementations of LlamaBot, a Python package for LLM exploration. I dissect the differences between SimpleBots, StructuredBots, and AgentBots, highlighting their capabilities and limitations in terms of agency and decision-making. Through audience discussions and examples, I aimed to provoke thought on the definition and design of agents, and together, we had an engaging discussion. Can we truly define an agent, or is it like the Turing Test, a concept that evolves with our understanding and technological advancements?
Read on... (3331 words, approximately 17 minutes reading time)
written by Eric J. Ma on 2025-01-19 | tags: biotech datasets machine learning research data fusion decision support systems data science
In this blog post, I explore the challenges biotech teams face when integrating public datasets with internal data for machine learning. Despite initial excitement, issues like data compatibility, missing variables, domain shifts, and biological complexity often arise. I suggest a shift from a machine learning perspective to a decision support approach, advocating for separate models and a decision fusion layer that incorporates human expertise. This method respects the complexity of biological systems and aids in effective decision-making. How can we better navigate these challenges to accelerate biotech discoveries?
Read on... (1373 words, approximately 7 minutes reading time)
written by Eric J. Ma on 2025-01-13 | tags: dictation accessibility productivity artificial intelligence writing workflow voicepal creativity
When typing became physically demanding, I discovered that dictation tools could do more than just help me write – they could fundamentally change how I capture and develop ideas. Using Better Dictation and VoicePal, combined with AI assistance, I found a way to write that matches the natural flow of thought. This isn't just about accessibility or working around limitations; it's about finding a better way to translate the nonlinear, rapid-fire nature of our thoughts into written words. I share my approach to preserving authentic voice while using AI tools, and why sometimes constraints push us toward unexpected improvements in how we work.
Read on... (801 words, approximately 5 minutes reading time)
written by Eric J. Ma on 2025-01-10 | tags: cybersecurity pre-commit hooks jupyter secrets management best practices data science
In this post, I share a practical approach to managing secrets in data science workflows, learned from personal experience with both successes and mistakes. I cover essential tools like direnv and .env files for local development, strategies for secure secret handling in Jupyter notebooks, and crucial version control practices including pre-commit hooks to catch accidental API key commits. I also discuss team collaboration approaches for secret sharing, platform-specific secrets management features, and what to do when secrets accidentally get committed to repositories. While tools like AWS Secrets Manager exist for enterprise needs, I focus on practical, accessible methods that create robust security through layered defenses, following proven software development principles that apply equally well to data science work.
written by Eric J. Ma on 2025-01-04 | tags: llms automation workflows tools agents
In this blog post, I explore what defines an LLM agent, highlighting its goal-oriented non-determinism, decision-making flow control, and natural language interfaces. I also discuss when to use agents, emphasizing the importance of variable scope inputs and constrained actions. By examining industry perspectives from Anthropic and Google, I show how agents can effectively handle diverse inputs while maintaining defined action boundaries. Real-world examples, like a bill calculation bot and a literature research assistant, illustrate these principles. How can these insights transform your approach to designing agent applications?
Read on... (1900 words, approximately 10 minutes reading time)
written by Eric J. Ma on 2024-12-31 | tags: blogging consistency ai content llms data biotech career writing discovery
In this blog post, I reflect on my year-long challenge of writing a blog post every week, surpassing my goal with 53 posts. This journey taught me the power of consistency, improved my ability to communicate complex ideas, and helped me develop AI-assisted tools to streamline my workflow. I also explored the intersection of life sciences and computation, aiming to accelerate scientific discovery. How did these experiences shape my approach to integrating AI into creative processes and what insights can you gain from my journey?
Read on... (2733 words, approximately 14 minutes reading time)
written by Eric J. Ma on 2024-12-20 | tags: docling nougat llms document parsing gpu
In this blog post, I explore the challenges of extracting structured text from PDFs, especially when dealing with equations, tables, and figures. I discuss two tools, Nougat-OCR by Facebook Research and Docling by IBM, which I found effective for this task. Nougat-OCR excels at handling equations and tables, while Docling excels at extracting figures. By combining these tools, we can develop a workflow that captures all critical components of a PDF. Want to know how to retain valuable knowledge from complex PDFs?
Read on... (916 words, approximately 5 minutes reading time)
written by Eric J. Ma on 2024-12-17 | tags: professional growth leadership relationships networking organizational change professional development
In this blog post, I explore how to navigate and thrive during organizational changes. I share personal insights and practical strategies, such as focusing on meaningful relationships with colleagues, consistently delivering great work, and proactively building your career path. I also emphasize the importance of staying present and cultivating a 'career committee' of trusted advisors. Change is inevitable in any organization, but how we respond can transform these shifts into growth opportunities. Curious about how to build your own resilience in changing times?
Read on... (592 words, approximately 3 minutes reading time)