written by Eric J. Ma on 2025-10-04 | tags: llm agents coding automation markdown testing package memory workflow scripts
In this blog post, I share how using AGENTS.md, a new open standard for AI coding agents, lets you teach your LLM assistant project-specific preferences that persist across sessions. I cover practical tips like enforcing markdown standards, specifying test styles, and introducing new tools, all by updating AGENTS.md. This approach turns your agent into a trainable teammate, not just a forgetful bot. Want to know how to make your coding agent smarter and more aligned with your workflow?
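As a hypothetical illustration (the section headings and rules below are assumptions for the sketch, not the post's actual file), an AGENTS.md that encodes project preferences might look like:

```markdown
# AGENTS.md

## Markdown conventions
- Use ATX-style (`#`) headings, never Setext underlines.
- Fence code blocks with triple backticks and a language tag.

## Testing
- Write pytest-style test functions, not unittest classes.
- Place tests under `tests/`, mirroring the source layout.
```

Because the agent re-reads this file each session, preferences recorded here persist even though the model itself has no memory.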
Read on... (1448 words, approximately 8 minutes reading time)

written by Eric J. Ma on 2025-10-01 | tags: biotech ultralearning datascience lifesciences software learning career skills modeling feedback
In this blog post, I share how effective biotech data scientists master both life sciences and software skills by applying Scott Young's ultralearning principles. Drawing from my own experience, I explain how to strategically bridge knowledge gaps, focus on real-world projects, and alternate deep dives between domains for continuous growth. Want to know which ultralearning strategies can help you level up your biotech data science career?
Read on... (4570 words, approximately 23 minutes reading time)

written by Eric J. Ma on 2025-09-02 | tags: python pixi uv mkdocs automation ai scaffolding integration tooling workflows
In this blog post, I share how I've completely revamped The Data Science Bootstrap Notes for 2025, reflecting major changes in Python tooling and best practices. I discuss moving from conda to pixi and uv, automating project setup with pyds-cli, integrating AI thoughtfully, and embracing CI/CD for reproducible workflows. I also highlight the core philosophies that guide my approach and explain what outdated advice I've removed. Curious how these changes can help you build scalable, modern data science projects?
Read on... (1100 words, approximately 6 minutes reading time)

written by Eric J. Ma on 2025-09-01 | tags: productivity negotiation presentations llm automation communication competencies ghostwriting updates ai
In this blog post, I share 10 practical ways I've used AI and large language models to save time and boost my effectiveness at work, beyond just coding and emails. From crafting tailored presentations and prepping for negotiations to automating tedious forms and practicing tough conversations, these strategies help you focus on what really matters. Want to know how AI can help you work smarter, not harder, beyond 2025?
Read on... (2157 words, approximately 11 minutes reading time)

written by Eric J. Ma on 2025-08-24 | tags: biotech communication decisions statistics translation collaboration trust meetings probability stakeholders
In this blog post, I share practical strategies for data scientists and statisticians to communicate more effectively with lab scientists in biotech. Instead of overwhelming collaborators with methods, I explain how to focus on decision-making, translate complex analyses into actionable probabilities, and build trust through clarity. I also offer tips for structuring meetings and anticipating common questions. Want to know how to make your insights drive real decisions in the lab?
Read on... (3415 words, approximately 18 minutes reading time)

written by Eric J. Ma on 2025-08-23 | tags: python runtime llm security namespace compilation execution functions toolbot monkeypatching
In this blog post, I share how I discovered a powerful Python trick: dynamically changing a function's source code at runtime using the compile and exec functions. This technique enabled me to build more flexible AI bots, like ToolBot, that can generate and execute code with access to the current environment. While this opens up exciting possibilities for LLM-powered agents and generative UIs, it also raises serious security concerns. Curious how this hack can supercharge your AI projects, and what risks you should watch out for?
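A minimal sketch of the compile-and-exec technique described here (the function and variable names are illustrative, not ToolBot's actual code):

```python
# Build a function from source text at runtime, then execute it in a
# namespace that already contains objects from the current session.
source = '''
def summarize(data):
    # The generated function can see `scale` from the injected namespace.
    return sum(data) * scale
'''

namespace = {"scale": 2}           # objects the generated code may access
code = compile(source, "<generated>", "exec")
exec(code, namespace)              # defines `summarize` inside `namespace`

result = namespace["summarize"]([1, 2, 3])
print(result)  # 12
```

Because exec runs arbitrary code with whatever the namespace exposes, this is exactly the security surface the post warns about: anything an LLM generates here runs with full access to the injected objects.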
Read on... (2187 words, approximately 11 minutes reading time)

written by Eric J. Ma on 2025-08-15 | tags: productivity workflows evaluation metrics business science models ai tools measurement
In my latest post, I share how large language models are changing the data science landscape: not by replacing us, but by making us more effective and opening up new opportunities to build custom AI solutions. I discuss why our skills in measurement and evaluation are more valuable than ever. Curious how data scientists can thrive in the LLM era?
Read on... (913 words, approximately 5 minutes reading time)

written by Eric J. Ma on 2025-08-06 | tags: bayesian variance r2d2 dirichlet multilevel glm regularization priors inference pymc
In this blog post, I share my journey exploring the R2D2 framework for Bayesian modeling, which lets you intuitively control model fit by placing a prior on R² instead of individual coefficients. I walk through its elegant extensions to generalized linear and multilevel models, showing how it automatically allocates explained variance and prevents overfitting. Curious how this approach can simplify your modeling and highlight the most important factors in your data?
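A minimal sketch of the R2D2 idea using numpy/scipy (the prior shapes and parameter values are illustrative assumptions): draw R² from a Beta prior, convert it into a total explained-variance budget, and let a Dirichlet split that budget across coefficients.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
p = 4                      # number of predictors
sigma2 = 1.0               # residual variance (assumed known for the sketch)

# Prior on R^2: Beta(a, b) encodes how much overall fit we expect.
r2 = stats.beta(2, 2).rvs(random_state=rng)

# Convert R^2 into a global variance budget: tau^2 = sigma^2 * R^2 / (1 - R^2).
tau2 = sigma2 * r2 / (1 - r2)

# A Dirichlet allocates shares of that budget across the p coefficients.
phi = stats.dirichlet([1.0] * p).rvs(random_state=rng)[0]

# Each coefficient gets a normal prior scaled by its share of the variance.
betas = rng.normal(0.0, np.sqrt(phi * tau2))
print(betas.shape)  # (4,)
```

The appeal is that a single interpretable prior on R² regularizes all coefficients jointly: coefficients with small Dirichlet shares are shrunk hard, which is how the framework highlights the most important factors.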
Read on... (2301 words, approximately 12 minutes reading time)

written by Eric J. Ma on 2025-07-21 | tags: automation ai memory design coding testing architecture prototyping review pairing
In this blog post, I share how months of hands-on struggle and learning paved the way for me to ship a complex graph-based memory feature for Llamabot in just two days, with AI as my design partner. I explain why you have to "earn your automation" and how AI can amplify, not replace, your critical thinking. Curious how pairing deep preparation with AI can supercharge your workflow and lead to breakthroughs?
Read on... (2449 words, approximately 13 minutes reading time)

written by Eric J. Ma on 2025-07-15 | tags: xarray bioinformatics reproducibility cloud workflow alignment features laboratory datasets scaling
In this blog post, I share how using xarray can transform laboratory and machine learning data management by unifying everything (measurements, features, model outputs, and splits) into a single, coordinate-aligned dataset. This approach eliminates the hassle of index-matching across multiple files, reduces errors, and makes your workflow more reproducible and cloud-ready. Curious how this unified structure can simplify your experimental data analysis and save you time? Read on to find out!
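A minimal sketch of the unified-dataset idea with xarray (the variable and coordinate names are illustrative): measurements, features, and split labels all share a `sample` coordinate, so alignment happens by label rather than by row order across separate files.

```python
import numpy as np
import xarray as xr

samples = ["s1", "s2", "s3"]

ds = xr.Dataset(
    {
        # Raw laboratory measurement, one value per sample.
        "measurement": ("sample", np.array([0.1, 0.5, 0.9])),
        # Featurized representation, aligned on the same sample coordinate.
        "features": (("sample", "feature"), np.arange(6.0).reshape(3, 2)),
    },
    coords={
        "sample": samples,
        "feature": ["f1", "f2"],
        # Train/test split stored alongside the data it describes.
        "split": ("sample", np.array(["train", "train", "test"])),
    },
)

# Selecting the test split keeps every variable aligned automatically.
test = ds.where(ds["split"] == "test", drop=True)
print(test["measurement"].values)  # [0.9]
```

One selection expression filters measurements and features together, which is the point: there is no separate index-matching step to get wrong.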
Read on... (1479 words, approximately 8 minutes reading time)