Building a Local RAG System for My Research

I’ve been working on a practical application of local language models: building a Retrieval-Augmented Generation (RAG) system to interact with my personal collection of hydrology research papers. The goal is to create a tool that can accurately answer questions based on a curated set of documents, moving beyond the general knowledge of a standard LLM. This document outlines the process, the tools I used, and my understanding of how it all works.

4 min read

Use Claude Code and Gemini CLI for Vibe Coding on HPC

It is crazy how fast Gen-AI for coding has been advancing recently, with game-changing tools coming out on a weekly basis. Thanks to that, my own workflow has evolved dramatically in the last few months. It feels like only yesterday I was copying and pasting code from a ChatGPT window, and now I’m doing ‘vibe coding’ with CLI tools like Claude Code and Gemini CLI. Today, with some help from Gemini, I successfully installed these state-of-the-art tools on an HPC cluster, so I don’t need to install them on my local computer. Here’s how I did it, covering the initial Node.js installation followed by the Gemini CLI and Claude Code installations.

3 min read

Making Interactive Web Charts with Help from AI

I’ve been playing with generative AI tools like ChatGPT and Google’s Gemini lately. It’s pretty wild what they can do. Today I want to share how they helped me create an interactive chart.

2 min read

A Comprehensive Overview of Hydrologic Models

This is a summary generated by ChatGPT after I had a long discussion with it about hydrologic models. I’d like to post it here, but keep in mind that some of the information has not been verified.

3 min read

Efficient Access and Data Management with Supercomputers

Accessing supercomputers like NERSC typically requires entering both a password and a one-time password (OTP) each time you log in or transfer data, which is annoying. Fortunately, NERSC offers a service called sshproxy, which generates a temporary SSH key file that is valid for 24 hours. This significantly reduces the need for frequent password entries.

1 min read

Dealing with Line Endings in Git

Recently, while managing my Jekyll blog hosted on GitHub Pages, I stumbled upon a peculiar issue. Every time I added a new blog post, the entire content would appear on the homepage instead of just the title. After some digging, I realized the problem stemmed from the way line endings were being handled. As I dove deeper, I learned how widespread this issue is for developers collaborating across different operating systems. Here’s the full story and how you can prevent similar hiccups in your projects.

1 min read

Using coupled simulation for offline runs

Sometimes I conduct offline land simulations driven by the climate output from a coupled simulation. This approach is time-efficient for some land-feature tests, especially when the feedback loops between the land and the atmosphere aren’t the primary concern. Here are the steps to accomplish this:

1 min read

Updated E3SM Workflow

I was recently asked to conduct a coupled E3SM simulation experiment. This task provides me with the opportunity to familiarize myself with the latest best-practice simulation standards. An official step-by-step guide can be found here.

3 min read

Useful Links

I want to have a one-stop page for useful datasets, links, and other resources.