(Today's meme: Just do stuff via @NickADobos)
- Big Data / AI
- Extracting Training Data from ChatGPT
 - llamafile is the new best way to run a LLM on your own computer
 - Seamless Communication - A significant step towards removing language barriers through expressive, fast and high-quality AI translation
 - DeepMind develops AI that demonstrates social learning capabilities
 - AI and Mass Spying
 - DemoFusion: Democratising High-Resolution Image Generation With No $$$
 - New AI tool aims to democratise high-res image generation
 - Build with Gemini - Integrate Gemini models into your applications with Google AI Studio and Google Cloud Vertex AI.
 - Introducing the OpenAI Switch Kit: Move from closed to open-source AI in minutes
 - Announcing Purple Llama: Towards open trust and safety in the new world of generative AI
 - When you upload the documents that are central to your projects, NotebookLM instantly becomes an expert in the information that matters most to you.
 - Bash One-Liners for LLMs
 - llamafile lets you distribute and run LLMs with a single file.
 - 7.1 million miles, 3 minor injuries: Waymo’s safety data looks good
 - Talk2Arxiv is an open-source RAG (Retrieval-Augmented Generation) system specially built for academic paper PDFs.
 - GROBID is a machine learning library for extracting, parsing and re-structuring raw documents such as PDF into structured XML/TEI encoded documents with a particular focus on technical and scientific publications.
 - A Guide to Large Language Model Abstractions
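 - "I told ChatGPT I'd give it a tip"... LLM answers shift easily with trivial prompt changes
 - Google Cloud and Hugging Face sign strategic partnership "to accelerate generative AI and ML development"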
 - The retina is arguably the most impressive part of the brain.
 - llm-course - Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
 - 🪆Matryoshka Representation Learning (MRL) 🪆
 - Meet Jan - Bringing AI to your Desktop: Open-source ChatGPT alternative that runs 100% offline on your computer.
 - The 🤗 Hugging Face Hub now natively supports ⚡⚡ WebDataset ⚡⚡
 - AbacusAI's MetaMath-Bagel-DPO-34B sits at the top of HF🤗 LLM leader board for Maths & reasoning!
 - Matryoshka Representation Learning
 - The secrets I learned after making more than 500 LoRAs
 - Using OpenAI text-embedding-3-large and text-embedding-3-small
 - New embedding models and API updates by OpenAI
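 - 4 things to weigh in an AI-centric data strategy
 - “Hard to catch with existing security tools”: 4 LLM-related risks enterprises face
 - “An LLM you can trust, without hallucinations”: Understanding retrieval-augmented generation (RAG)
 - Teaching machines to ‘code’
 - GN⁺: Ollama releases Python & JavaScript libraries (ollama.ai)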
 - The "Dense X Retriever" paper shows that it significantly outperforms the traditional chunk-based retriever
 - GN⁺: OpenAI quietly drops its promise to make key documents public (wired.com)
 - "Generative AI will not have a major impact on global IT spending in 2024": Gartner
 - Chatbot vs Medical Student Performance on Free-Response Clinical Reasoning Examinations
 - Most Top News Sites Block AI Bots. Right-Wing Media Welcomes Them
 - Self-Rewarding Language Models
 - RAG app running on Apple Silicon using MLX, just 3 steps
 - The Ollama Python libraries are now available:
 - InternLM-Math - 7B and 20B Chinese and English Math LMs with better than ChatGPT performances.
 - Introducing 'Prompt Engineering with Llama 2' — an interactive guide covering prompt engineering & best practices for developers, researchers & enthusiasts working with large language models.
 - AI for Tracking & Detection
 - SigLIP model pre-trained on WebLi at resolution 256x256.
 - 🚀 Checkout LLM (Large Language Models) related Code on my GitHub
 - Retrieval augmented generation (RAG) example in MLX !
 - Use RAG to build advanced text-to-SQL
 - GitHub Copilot Chat Now Generally Available
 - TensorDict is a pytorch dedicated tensor container.
 - Fooocus is an image generating software (based on Gradio).
 - What is a foundation model?
 - Exphormer: Scaling transformers for graph-structured data
 - Many cutting-edge computer vision models consist of multiple stages:
 - SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities
 - Accelerating Generative AI with PyTorch IV: Seamless M4T, fast
 - Prompt Of The Year: 2023 🌟
 - Which problems do students get wrong most... learning data generated in classrooms to be handed over to the private education industry
 - Generative AI Frontline in 2023/2024
 - How to cut AI costs from $100 a day to $1: fine-tuning Mixtral using GPT-4 (twitter.com/wenquai)
 - viser is a library for interactive 3D visualization + Python, inspired by tools like Pangolin, rviz, meshcat, and Gradio.
 - Non-determinism in GPT-4 is caused by Sparse MoE
 - Vertex's advanced image generation capabilities are even better when you ask Gemini to do the prompting.
 - MLX implementation of Mamba 🐍
 - Teaching CS50 with AI
 - Faster RAG re-ranking with ColBERT
 - Unlocking productivity and personalizing learning with AI
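 - AI comes to the courts... a ‘ruling recommendation AI’ to be introduced for the first time this September
 - GN⁺: WhisperSpeech – an open-source speech synthesis system built by inverting Whisper (github.com/collabora)
 - If a model cannot be reproduced, it is not open source (twitter.com/amasad)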
 - ActAnywhere: Subject-Aware Video Background Generation
 - QuantumReservoirPy: A Software Package for Time Series Prediction
 - Use RAG to do ArXiv Research 🌐
 - Full Stack Deep Learning - 2022 Course
 - Weaviate is now supported in DSPy 🤌
 - Sentence Embeddings. Cross-encoders and Re-ranking - Deep Dive into Cross-encoders and Re-ranking
 - Here's a MusicGen fine tune: musicgen-songstarter-v0.1 🎶
 - Vision Mamba - Efficient Visual Representation Learning with Bidirectional State Space Model
 - 🌳 Model Family Tree - Automatically calculate the family tree of a given model. It also displays the type of license each model uses (permissive, noncommercial, or unknown)
 - RAGxplorer is an interactive streamlit tool to support the building of Retrieval Augmented Generation (RAG) applications by visualizing document chunks and the queries in the embedding space.
 - Google Announces Video Generation LLM VideoPoet
 - All the Math You Missed: (But Need to Know for Graduate School)
 - LLaMA Board: A One-stop Web UI for Getting Started with LLaMA Factory
 - RAG - Vector Retrieval - A Comprehensive Study
 - DeepSpeed-FastGen: Introducing Mixtral, Phi-2, and Falcon support with major performance and feature enhancements.
 - The 10 types of clustering that all data scientists need to know. Let's dive in:
 - CoPilot for MS Office is pretty neat. Been playing with it for a bit. It's VERY tricky to sign up for though! ($20 / month). You DO NOT need an enterprise account. After a bunch of digging around this is how I enabled it on my personal/family account:
 - THE ELEMENTS OF EUCLID WITH COLOURED DIAGRAMS AND SYMBOLS
 - Self-Driving Cars — Andreas Geiger - Tübingen Machine Learning
 - The inbreeding of AI
 - Stability released a 1.6b alternative to TinyLLama & Phi-2
 - Heat.js - a heat map visualization library (william-troup.com)
 - Mathematics for Deep Learning 🧑‍🎓
 - DataTrove is a library to process, filter and deduplicate text data at a very large scale.
 - nanotron - The objective of this library is to provide easy distributed primitives in order to train a variety of models efficiently using 3D parallelism.
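 - Privy - a privacy-focused coding assistant (github.com/srikanth235)
 - “Determining whether generative AI was trained fairly”: Fairly Trained offers a certification program
 - Australian government: “Considering mandatory guardrails for AI”
 - How to solve ‘rate limits’, the biggest obstacle with LLMs
 - ‘Deepfakes, voice manipulation’... OpenAI's emerging challenge ahead of elections worldwide
 - Baidu LLM used in Chinese military AI development? ··· Baidu: “We never cooperated”
 - "Reset your brain to conversation mode, not command mode"··· Microsoft suggests '5 AI usage habits'
 - Growing AI datasets... the impact on data engineering and management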
 - 387% faster TinyLlama, 6x faster GGUF conversion
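 - ‘Vast amounts of scientific data’... NASA's journey to adopting generative AI
 - OpenAI deletes the terms of use covering military use of its AI technology
 - ‘Executives want to adopt generative AI, but few know how’... Boston Consulting survey results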
 - Spotify's Approach to Leverage Recursive Embedding and Clustering to Enhanced Data Explainability
 - Meta is developing open source AGI, says Zuckerberg
 - Meta's AI Initiatives: $20B+ of Investment in GPUs and Data Center Infra
 - Open-source AI looks strong.
 - 5 Hard Truths About Generative AI for Technology Leaders
 - Preference Tuning LLMs with Direct Preference Optimization Methods
 - State-of-the-art Code Generation with AlphaCodium – From Prompt Engineering to Flow Engineering
 - Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering
 - VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
 - New short course on LLMOps!
 - Overview of LLMs for Evaluation
 - Fine-grained Hallucination Detection and Editing For Language Models
 - MLX Chat - A hackish frontend for mlx-lm.
 - Tuning Language Models by Proxy
 - Introducing ASPIRE for selective prediction in LLMs
 - 🤯Inverted Whisper == Whisper Speech!
 - AIM: Autoregressive Image Models
 - Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
 - LILO: Learning Interpretable Libraries by Compressing and Documenting Code
 - GN⁺: TinyML: ultra-low-power machine learning technology (ikkaro.net)
 - “Many AI girlfriends in the GPT Store”: guideline violations pile up as soon as it opens
 - AlphaGeometry: An Olympiad-level AI system for geometry
 - OpenAI Changes Its Stance, Announces That It Is Working With the Pentagon
 - WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation
 - AlphaGeometry paper explanation.
 - SonicVisionLM: Playing Sound with Vision Language Models
 - VeRA: Vector-based Random Matrix Adaptation
 - AI poisoning could turn open models into destructive “sleeper agents,” says Anthropic
 - Summarizing Post Incident Reviews with GPT-4 = How we use GPT-4 to summarize incident reports.
 - The Faiss library
 - MathVista: Evaluating Math Reasoning in Visual Contexts
 - Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting
 - Democratizing LLMs: 4-bit Quantization for Optimal LLM Inference
 - Over the years we have created dozens of Computer Vision tutorials. This repository contains examples and tutorials on using SOTA computer vision models and techniques.
 - MLX Data - MLX Data is a framework agnostic data loading library brought to you by Apple machine learning research.
 - stable-code - Stable Code 3B is a model offering accurate and responsive code completion at a level on par with models such as CodeLLaMA 7B that are 2.5x larger.
 - Stable Code 3B: Coding on the Edge
 - 👉 PyTorch Image Models
 - Moore-AnimateAnyone - image-to-video synthesis for character animation (github.com/MooreThreads)
 - Query Pipeline for Advanced Text-to-SQL
 - I'm currently looking into different metrics and frameworks around Retrieval-Augmented Generation (RAG) evaluation.
 - A breakdown of the Long Context Retrieval Embedding Models from Stanford!💥
 - codeqai - Search your codebase semantically or chat with it from cli.
 - MLSys Seminars - Stanford MLSys Seminars
 - Rethinking Tabular Data Understanding with Large Language Models
 - Mix-Self-Consistency Pack - LLMs can reason over tabular data
 - PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
 - SD Turbo + TAESD on WebGPU with diffusers.js
 - Towards Conversational Diagnostic AI
 - Direct Preference Optimization: Your Language Model is Secretly a Reward Model
 - Listening with LLM
 - Texify - Texify is an OCR model that converts images or pdfs containing math into markdown and LaTeX that can be rendered by MathJax ($$ and $ are delimiters). It can run on CPU, GPU, or MPS.
 - Image-to-LaTeX in 3 lines of JavaScript code, with 🤗 Transformers.js!
 - Signs and Portents - Some hints about what the next year of AI looks like
 - "Machine Learning 4771" course materials from @Columbia are open.
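 - A few days ago, Google Research also published its research on ‘whether LLMs can identify and correct their own mistakes’.
 - Whether LLMs can identify and correct their own mistakes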
 - A new way to search with generative AI - Get AI-powered overviews and ask follow ups, right in Search.
 - Vanna is an MIT-licensed open-source Python RAG (Retrieval-Augmented Generation) framework for SQL generation and related functionality.
 - Building a fully local LLM voice assistant to control my smart home
 - GN⁺: Building a fully local LLM voice assistant to control a smart home (johnthenerd.com)
 - Generative AI: a game-changer that society and industry need to be ready for
 - INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning
 - AI Voice Assistant: Enhancing Accessibility in AI with LlamaIndex and GPT3.5 (Deployed in Prod on Vercel and Render)
 - Generative AI in the Enterprise
 - Failure Points In RAG Systems
 - Deep Learning in Computer Vision with Prof. Kosta Derpanis (York University)
 - EECS 4422 Computer Vision with Prof. Kosta Derpanis
 - Sam Altman reveals in an interview with Bill Gates (2 days ago) what's coming up in GPT-4.5 (or GPT-5):
 - Retrieval meets Long Context Large Language Models
 - RAG (retrieval-augmented generation) or longer context window LLM - what performs better? 🤔
 - Deep Learning for Computer Vision Michigan Online
 - (Long)LLMLingua: Enhancing Large Language Model Inference via Prompt Compression
 - The Llama Hitchiking Guide to Local LLMs
 - The AI Hierarchy of Needs
 - AMIE: A research AI system for diagnostic medical reasoning and conversations
 
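 - Hardware
- SK hynix returns to profit... ‘a sign of AI-driven growth in DRAM demand’
 - “Global data center energy consumption... to approach 1,000 terawatt-hours within 2 years”
 - 8 expansion boards that add possibilities to Raspberry Pi projects
 - 3 technologies that make generative AI more reliable
 - 5 reasons to buy the Apple Vision Pro
 - I got rid of all my bookshelves when I moved. This is what an iPad is for, right? (Canon R40 book scanner)
 - Lasts 10,000 years and needs no electricity... the piece of glass that holds '1,750 movies'
 - RAG-Survey - a GitHub document that summarizes RAG well
 - Major overseas publishers demand “no AI translation” clauses in contracts with Korean publishers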
 - SCEPTER is an open-source code repository dedicated to generative training, fine-tuning, and inference, encompassing a suite of downstream tasks such as image generation, transfer, editing.
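 - "XR headset shipments to see double-digit growth in 2024··· Apple Vision Pro to spur the market"
 - HHKB Studio review: Happy Hacking's adventure beyond Topre
 - “Stores 65% more energy”: a battery using a silicon anode arrives
 - How does QA validate generative AI testing tools?
 - The ‘10-year cycle’ semiconductor market rebound comes into view... “The AI concept, not a specific product, is driving growth”
 - A belt-tightening IT industry: "We will still buy AI servers"... global server makers soar
 - What does the ‘AI PC’ that blanketed CES 2024 mean for consumers?
 - Samsung Galaxy S24 series ships with Google's generative AI 'Gemini Pro' and DeepMind's AI image generation tool 'Imagen 2'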
 - Comparing the 1970’s Cray-1 supercomputer against the Raspberry Pi single-board computer range #RaspberryPi @hacksterio
 - THE VACUUM TUBE’S FORGOTTEN RIVAL
 - GN⁺: Building a high-speed 10Gbps full-mesh network with USB4 for $47.98 (fangpenlin.com)
 - LTO (Linear Tape-Open) drives
 - Running Local LLMs and VLMs on the Raspberry Pi
 - R1 - a palm-sized device with its own built-in foundation model (youtube.com)
 - Struggling with Machine Learning algorithms? 🤖
 - What developers need to know when coding for Apple Vision Pro (zdnet.com)
 - MediaTek to adopt Nvidia GPU in flagship mobile chip
 
 - Reading
- Purchasing power parities (PPP) - Total, National currency units/US dollar, 2000 – 2022
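 - Practical Finance for University Students (3rd edition)
 - The future of book publishing - 2024 and beyond (linkedin.com)
 - Betrayal upon betrayal upon betrayal… the ‘hearty breakfast’ cereal turns out to be a soap opera
 - What if the era of the production cliff arrives before the era of the electric car?
 - "Eat and drink what the innkeeper serves and stay the night for free"... the 'jumak', the original jjimjilbang
 - The uncomfortable truth about the Haneda JAL Flight 516 collision
 - Bad enough to hurt company earnings… Hertz to sell off 20,000 EVs
 - Scott Galloway's predictions for 2024 (medium.com/@profgalloway)
 - Will private education spending fall if advanced math is dropped?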
 - Laser mapping reveals oldest Amazonian cities, built 2500 years ago
 
 
(Bonus: The best math joke ever. 😂 via @mathladyhazel)
EOB
