• Flask's request

    Flask's request is an object that implements PEP 567 context variables through Werkzeug's LocalProxy. Context variables improve on thread-local storage: "This concept is similar to thread-local storage (TLS), but, unlike TLS, it also allows correctly keeping track of values per asynchronous task, e.g. asyncio.Task." (https://peps.python.org/pep-0567/) LocalProxy lets you treat the context-bound value as if it were a local object, without calling ContextVar.get() every time. In the code below, _request and…


  • How to limit or cap a Google Cloud budget

    The documents are there, but they are a bit hard to comprehend and follow, so I’m writing the steps here. First, create a budget. I already receive email alerts when 50%, 70%, etc. of the budget is hit. It can be set up following this. Now you need to shut down your service if the budget reaches…


  • Foundation of vector retrieval

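    Gemini has arrived with a 1M-token context, but this looked like a well-organized paper, so I read it. It is Foundations of Vector Retrieval, written by a research scientist at Pinecone. Below are the notes I took while studying it. 35/203 With suitable data preprocessing, inner product becomes equivalent to cosine similarity and Euclidean distance, so it is a generalization of both. However, the criteria for a distance, namely non-negativity, coincidence (an item having the highest similarity with itself…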
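
    The relationship between inner product, cosine similarity, and Euclidean distance can be checked numerically with a small sketch. It assumes that the preprocessing is normalization to unit length, which is one common choice and not necessarily the transformation the paper uses; the helper names are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    """Scale a vector to unit length (assumed preprocessing step)."""
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def squared_euclidean(u, v):
    """Squared Euclidean distance between u and v."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


u, v = [3.0, 4.0], [1.0, 2.0]
nu, nv = normalize(u), normalize(v)

# On unit vectors, the inner product equals the cosine similarity of the originals.
inner_equals_cosine = math.isclose(dot(nu, nv), cosine_similarity(u, v))

# On unit vectors, ||a - b||^2 = 2 - 2 * <a, b>, so ranking by inner product
# and ranking by Euclidean distance give the same order.
euclid_identity_holds = math.isclose(squared_euclidean(nu, nv), 2 - 2 * dot(nu, nv))
```

    Both checks evaluate to true for these vectors, because for unit vectors the squared distance expands to 2 minus twice the inner product.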

  • How to install MongoDB on Crostini (or Debian bookworm)

    If you come across a libssl1.1 issue, you can either install the library from the Ubuntu repository or install MongoDB itself from the Ubuntu repository. I personally prefer the latter, since I don’t like keeping an apt setting just for a single library. Just follow https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-ubuntu/. If you don’t know if you’re on bookworm, this…


  • Metric-driven development

    The first rule of machine learning shared by Google is “Don’t be afraid to launch a product without machine learning.” That is because it lets you first define metrics, build a working pipeline, fail, evaluate, and improve. Emmanuel Ameisen, a machine learning engineer at Stripe, likewise: “ML is an iterative process where the fastest way to make progress is to see how a…

  • How to reliably use an LLM to get JSON outputs

    When using an LLM, especially to get JSON output, many things can go wrong. I’ll explain some of them in this post. Prompt The first step is to ask the LLM to generate JSON as output. I use the following: Every LLM produces its own set of errors. Tweak the above as necessary. Parsing Even with a strong…
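
    As an illustration of the parsing problem, here is a minimal sketch of one defensive approach: take the span from the first opening brace to the last closing brace and parse only that. This is an assumption about a reasonable strategy, not the method used in the post, and `extract_json` is a hypothetical helper name.

```python
import json


def extract_json(text):
    """Parse the first-to-last brace span of an LLM reply as a JSON object.

    Raises ValueError if no brace-delimited span exists or if the span is
    not valid JSON (json.JSONDecodeError is a subclass of ValueError).
    """
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in the reply")
    return json.loads(text[start:end + 1])


# Example reply with chatter around the payload (made-up text).
reply = 'Sure! Here is the result:\n{"a": 1, "b": [1, 2]}\nHope it helps.'
parsed = extract_json(reply)
```

    The sketch only handles a single top-level object; replies containing several objects or a top-level array would need a different strategy.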


  • VS Code git merge appears as rebase

    I was very confused until recently because I didn’t understand that git merge can perform a fast-forward. In the log, it looks as if git merge performed a rebase. To prevent that, I’m forcing these options: Sadly, those options have no impact on VS Code’s “Git merge…” command. One should either use plugins like “Git Graph” or do…


  • .[x]profile, .[x]rc files

    I always forget what .bash_profile, .bashrc, .zprofile, and .zshrc do, so I’m summarizing them here: What does this mean? It’s natural for some code to appear in both .profile and .rc files; for example, pyenv recommends it if a non-interactive login shell script relies on pyenv.


  • How to fix an impersonation error when using Vertex AI

    Somehow the authentication part isn’t easy to find in a single place, so I’m adding it here. For local dev, run: That makes your code use your own Google account. After that, restart your dev environment to pick it up. If your code is running in Google Compute Engine (GCE) or Cloud Run, you need to…


  • UMAP vs t-SNE

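    When t-SNE projects points from a high-dimensional space into a low-dimensional one, it assumes that the distances between the high-dimensional points arose probabilistically, assumes the same for the distances between points in the low-dimensional space, and then makes the two probability distributions match as closely as possible. As a result, far points stay far and near points stay near, but probabilistically rather than as a hard decision. I could roughly follow that much from the formulas, but I was finding UMAP difficult when I came across an excellent article. https://pair-code.github.io/understanding-umap/ “UMAP constructs…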