• How to install MongoDB on Crostini (or Debian bookworm)

    If you run into the libssl1.1 issue, you can either install the library from the Ubuntu repository or install MongoDB itself from the Ubuntu repository. I personally prefer the latter, since I don’t like keeping an apt source around just for a single library. Just follow https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-ubuntu/. If you don’t know whether you’re on bookworm, this…

  • Metric-driven development

    The first rule of machine learning shared by Google is “Don’t be afraid to launch a product without machine learning.” Following it lets you start by defining metrics, building a working pipeline, failing, evaluating, and improving. Emmanuel Ameisen, a machine learning engineer at Stripe, likewise says “ML is an iterative process where the fastest way to make progress is to see how a…

  • How to reliably use an LLM to get JSON output

    When using an LLM, especially to get JSON output, many things can go wrong. I’ll explain some of them in this post. Prompt: The first step is to ask the LLM to generate JSON as output. I use the following: Every LLM produces its own kinds of errors. Tweak the above as necessary. Parsing: Even with a strong…
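
The excerpt above cuts off before the post's own parsing code, so the following is only a minimal sketch of one common approach, not the author's method: scan the reply for the first position where a JSON object or array parses cleanly, ignoring any surrounding prose. The helper name `extract_json` and the sample reply are hypothetical.

```python
import json


def extract_json(text: str):
    """Return the first JSON object or array embedded in `text`.

    Hypothetical helper: tries every '{' or '[' as a possible start and
    lets the standard decoder parse from there, so prose before or after
    the JSON is ignored.
    """
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch not in "{[":
            continue
        try:
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            continue  # not valid JSON from here; try the next candidate
    raise ValueError("no JSON object or array found in LLM output")


# Made-up reply that wraps the JSON in chatty text.
reply = 'Sure! Here is the result: {"name": "Ada", "tags": ["a", "b"]} Hope that helps.'
print(extract_json(reply))
```

Raising `ValueError` when nothing parses keeps the failure explicit, so the caller can decide whether to retry the request.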

  • .[x]profile, .[x]rc files

    I always forget what .bash_profile, .bashrc, .zprofile, and .zshrc do, so I’m summarizing them here: What does this mean? It’s natural for some code to appear in both .profile and .rc, e.g., pyenv recommends it if a non-interactive login shell script relies on pyenv.

  • How to fix an impersonation error when using vertexai

    Somehow the authentication part isn’t easy to find in a single place, so I’m adding it here. For local dev, run: That makes your code use your own Google account. After that, restart your dev environment so it picks up the change. If your code is running in Google Compute Engine (GCE) or Cloud Run, you need to…

  • UMAP vs t-SNE

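    When t-SNE projects points from a high-dimensional space into a low-dimensional one, it assumes that the distances between the high-dimensional points were generated probabilistically, assumes the same for the distances between points in the low-dimensional space, and then brings the two probability distributions as close together as possible. The upshot is that far things stay far and near things stay near, but as a probabilistic rather than a hard decision. I could roughly follow that much from the formulas, but I was finding UMAP hard to understand when I came across a truly excellent article. https://pair-code.github.io/understanding-umap/ “UMAP constructs…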
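
For reference, the t-SNE description above corresponds to the standard formulation below. This is the usual textbook form, not something taken from the post; the symbols are assumptions: x_i are the high-dimensional points, y_i their low-dimensional images, σ_i a per-point bandwidth, and N the number of points.

```latex
p_{j\mid i} = \frac{\exp\!\left(-\lVert x_i - x_j\rVert^2 / 2\sigma_i^2\right)}
                   {\sum_{k\neq i}\exp\!\left(-\lVert x_i - x_k\rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j\mid i} + p_{i\mid j}}{2N}

q_{ij} = \frac{\left(1+\lVert y_i - y_j\rVert^2\right)^{-1}}
              {\sum_{k\neq l}\left(1+\lVert y_k - y_l\rVert^2\right)^{-1}}

C = \mathrm{KL}(P\,\Vert\,Q) = \sum_{i\neq j} p_{ij}\,\log\frac{p_{ij}}{q_{ij}}
```

Minimizing C over the y_i is what "matching the two distributions" means here: pairs with large p_ij are pulled together, while the heavy-tailed q_ij lets distant pairs stay apart.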

  • Override MathJax in mdbook

    If your mdbook uses MathJax v2.x (which is the case when you download the Mac binary), you can override its font and the expression that triggers MathJax. In book.toml, add the following: And in mathjax-config.js (which should be at the top of your project directory, not under src): Change ‘scale’ to adjust the font size. Also, see…

  • Better prompt formatting for LLM use

    One typical way to format a prompt is to use { … } placeholders, like the below: But it fails as soon as JSON is involved in the input or output. So what should we do? My proposal is to use string.Template. Its primary purpose is i18n, but it also provides much safer formatting, e.g.,…
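
As a minimal sketch of the idea (the prompt text and variable names are made up for illustration, not taken from the post): `string.Template` only treats `$name` as a placeholder, so literal JSON braces in the prompt or in the substituted value need no escaping, whereas `str.format` tries to interpret them as replacement fields.

```python
from string import Template

# Hypothetical prompt: the JSON braces are plain text to Template.
PROMPT = Template(
    'Summarize the document and answer as JSON like {"summary": "..."}.\n'
    "Document: $document"
)

# The input itself is JSON, which is also fine.
doc = '{"title": "Q3 report", "body": "Revenue grew."}'
prompt = PROMPT.substitute(document=doc)
print(prompt)

# str.format, by contrast, chokes on the literal braces in the same prompt.
try:
    'Answer as JSON like {"summary": "..."}. Document: {document}'.format(document=doc)
    format_failed = False
except (KeyError, ValueError, IndexError) as exc:
    format_failed = True
    print("str.format failed:", type(exc).__name__)
```

`substitute` raises `KeyError` when a placeholder is missing; `safe_substitute` leaves unknown placeholders in place instead, which can be handy when a prompt is filled in several passes.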

  • Map-reduce pattern with LLMs

    One of the primary patterns for using an LLM is map-reduce. For example, process multiple docs in mappers and then reduce them into a single result in the reducer. LLM map-reduce sounds very intuitive and simple, but in reality it isn’t. One of the problems is hallucination. When it hallucinates, an LLM fails unexpectedly in an obvious…
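
The basic shape of the pattern can be sketched as follows. This is an illustration under stated assumptions, not the post's implementation: `fake_llm` is a hypothetical stand-in that just echoes the prompt body so the example runs without any API, and the prompt wording is invented. It does not address the hallucination problem the excerpt goes on to describe.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call.

    Drops the first (instruction) line and joins the remaining lines,
    so the example is deterministic.
    """
    _instruction, _, body = prompt.partition("\n")
    return " / ".join(line.lstrip("- ") for line in body.splitlines())


def map_reduce(docs: List[str], llm: Callable[[str], str]) -> str:
    # Map: one independent call per document, run concurrently.
    map_prompts = [f"Summarize in one sentence:\n{doc}" for doc in docs]
    with ThreadPoolExecutor(max_workers=4) as pool:
        partials = list(pool.map(llm, map_prompts))  # order is preserved

    # Reduce: a single call that combines the partial results.
    bullet_list = "\n".join(f"- {p}" for p in partials)
    return llm(f"Combine these summaries into one:\n{bullet_list}")


result = map_reduce(["cats sleep a lot", "dogs like walks"], fake_llm)
print(result)
```

In a real pipeline, the output of each mapper would typically be validated before it reaches the reducer, since one bad partial result contaminates the combined answer.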

  • Using a static external IP for Google Cloud Run

    Using a static IP for Google Cloud Run when connecting to an external network is pretty useful if any of your external counterparts uses IP-address-based authentication, e.g., MongoDB. VPC peering can solve this problem, but it costs a good amount of money for small experiments. This document explains how one can use static…
