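- Big Data / AI
- Synapsoft CEO Jeon Kyoung-heon: "Bringing generative AI to Korea's No. 1 document service... leaping forward as an AI-focused SaaS company"
- Oracle extends MySQL's use cases to generative AI
- Microsoft's impatience and haste, as seen in Windows AI features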
- LongLLaMA Code: Focused Transformer Training for Context Scaling
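- Open X-Embodiment: Robotic learning datasets and RT-X models (robotics-transformer-x.github.io)
- "First to support Meta Llama 2 via API": serverless generative AI service Amazon Bedrock launches its official version
- Google presents its history of AI development to mark its 25th anniversary
- Google Assistant gets smarter... "powered by generative AI Bard"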
- Visual Instruction Tuning - LLaVA: Large Language and Vision Assistant
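- "Google's dominant position is even more worrying in AI search," Satya Nadella testifies
- Will Hadoop's glory be repeated? Yahoo sets up a separate company for the B2B business of its AI data serving engine Vespa
- MongoDB announces new Atlas Vector Search features for building and scaling AI applications
- Masayoshi Son: "AGI will arrive within 10 years... reject AI and you'll end up at the level of a goldfish or a monkey"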
- Evaluating the Ideal Chunk Size for a RAG System using LlamaIndex
- RAG Is A Hack - with Jerry Liu from LlamaIndex
- Evaluating and Optimizing your RAG App
- Everything about Distributed Training and Efficient Finetuning
- Large Language Models (in 2023): video
- Google just dropped their new phone. Prepare to be blown away by its jaw-dropping AI features!
- Vector similarity - Use Redis as a vector database
- SeaGOAT - A code search engine for the AI age. SeaGOAT is a local search tool that leverages vector embeddings to enable you to search your codebase semantically.
- The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)
- turbopuffer <(°O°)> - A truly serverless vector database. 🐡💨
- Evaluating LLMs is a minefield
- Practical Advice for Retrieval Augmented Generation (RAG), by Sam Partee at QCon San Francisco
- The state of AI founders
- Build AI Content Search with Docs Agent
- The Docs Agent demo enables PaLM API users to launch a chat application on a Linux-based host machine using their own set of documents as a source dataset.
- AI copyright controversy spreads... "Korea also needs to improve its laws and systems"
- Can large language models provide useful feedback on research papers? A large-scale empirical analysis
- Pymilo is an open source Python package that provides a simple, efficient, and safe way for users to export pre-trained machine learning models in a transparent way.
- "The AI gold rush era"... Korea's three mobile carriers also move beyond telecom with AI
- Is Generative AI Taking Over the World?
- Today we're announcing #GAIA1: a 9B parameter world model, trained on 4,700 hours of driving data, able to simulate complex and diverse driving scenes from video, text and action inputs.
- We are excited to announce the release of XGBoost 2.0.
- "Still hasn't risen enough"... the monopoly companies riding on AI
- Meta to bring AI chatbots everywhere (theverge.com)
- CNN Explainer - An interactive visualization system designed to help non-experts learn about Convolutional Neural Networks (CNNs)
- Meta releases LLAMA 2 Long with a 32k-token context window (venturebeat.com)
- Open Interpreter lets LLMs run code (Python, Javascript, Shell, and more) locally.
- You can now get a full tracing/observability UI in *all* @llama_index RAG/agent pipelines, in one-line of code ⚡️
- Free Certification Course from IBM: Analyzing Data with Python!
- Learn more about NumPy for #Python
- Chroma embedding_functions.py
- ExpertQA: Expert-Curated Questions and Attributed Answers
- Writing poems using LLama 2 on Workers AI
- Announcing AI Gateway: making AI applications more observable, reliable, and scalable
- Workers AI: serverless GPU-powered inference on Cloudflare’s global network
- JVector is a pure Java embedded vector search engine that powers DataStax Astra and is being added to Apache Cassandra.
- Memory in Plain Sight: A Survey of the Uncanny Resemblances between Diffusion Models and Associative Memories
- THE SCIENTIFIC PYTHON DEVELOPMENT GUIDE
- 4 Controversial data science opinions of mine:
- L2CEval - How good are current LLMs at translating natural language into executable code?
- Falcon 180B Finetuning using 🤗 PEFT and DeepSpeed
- Generative AI's Act Two (sequoiacap.com)
- Multi-Document Agents
- Massive Text Embedding Benchmark
- Massive Text Embedding Benchmark (MTEB) Leaderboard
- WaffleCLIP - This repository contains code to replicate key experiments from our paper Waffling around for Performance: Visual Classification with Random Words and Broad Concepts.
- Detect objects in images. This runs entirely in your browser.
- Probability and Statistics 1
- GN⁺: Mistral 7B (mistral.ai)
- Windows Copilot is here!
- PostgreSQL vs Python for data cleaning: A guide
- transformers.js
- GN⁺: RealFill: Image completion using diffusion models (realfill.github.io)
- Everything You Always Wanted To Know About Mathematics* (*But didn’t even know to ask) - A Guided Journey Into the World of Abstract Mathematics and the Writing of Proofs
- @mark_riedl starts with distinguishing transparency, interpretability, and explainability.
- PaLM API: Text Quickstart with Python
- Top Python Libraries for Visualization: A Starting Guide
- Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack
- Introducing New AI Experiences Across Our Family of Apps and Devices
- RAG or fine-tuning? A few questions to ask before choosing
- Processing a 250 TB dataset with Coiled, Dask, and Xarray
- Dataherald - Natural language-to-SQL engine (github.com/Dataherald)
- This is GreenBitAI's research code for running 2-bit and 1-bit LLaMA models with extreme compression yet still strong performance.
- Transformer-VQ - Official implementation of 'Transformer-VQ: Linear-Time Transformers via Vector Quantization'.
- Demystifying CLIP Data
- Andrej Karpathy: large language models will serve as the OS of a new era
- Anthropic (Claude and Claude 2) started rolling out access to Chat and API users all over the world!
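- "How we overcame the fear of ChatGPT data leaks": Navan CSO Prabhath Karanth
- "Making real-time AI application development easier": Confluent strengthens its data streaming technology for AI
- "From training to generation": AI is a copyright blind spot
- The key to data transformation is people, not technology
- Microsoft adds the AI image generation tool Cocreator to Windows Paint
- Following the OpenAI-Microsoft alliance? Anthropic secures a $4 billion investment from Amazon, "gaining access to infrastructure"
- "AI is coming to Windows": a preview of the Windows 11 23H2 update
- Superflows - A toolkit for building AI Copilots for SaaS (github.com/Superflows-AI)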
- Google open-sourcing new SoTA CLIP models 🥳 - mple colab for SigLIP models described in the SigLIP paper.
- 日本語LLMまとめ (Overview of Japanese LLMs)
- How Generative AI and LLMs Unlock Greater Workforce Productivity
- Effective Long-Context Scaling of Foundation Models
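- "The AI wind is blowing through healthcare too"... KHF 2023 leads the digital transformation of healthcare
- "Public-sector hyperscale AI that handles sensitive information should use private-sector technology but run on a closed cloud"
- Unlike Microsoft and Google... a cautious approach to charging for generative AI is spreading
- Amazon Healthcare: "As the healthcare market changes, the role of the cloud is growing"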
- Core ML Stable Diffusion - Run Stable Diffusion on Apple Silicon with Core ML
- Building RAG-based LLM Applications for Production (Part 1)
- Hugging Face's Guide to Optimizing LLMs in Production
- Less than 31 hours since OpenAI started dropping the ChatGPT vision feature on pro users... People are scratching their heads in disbelief.
- Non-engineers guide: Train a LLaMA 2 chatbot
- Fine-Tuning or RAG — Why Not Both?
- Use @llama_index data agents for blockchain data analysis 🔑🐋
- Creating a LLaMa 2 Agent Empowered with Wikipedia Knowledge
- VideoDirectorGPT: Consistent Multi-Scene Video Generation via LLM-Guided Planning
- We release OpenLM a simple and minimalist PyTorch codebase for training medium-sized language models.
- A hacker's guide to Language Models
- vpselector - The Visual Pandas Selector is a tool to visually select portions of numeric time-series data from a pandas dataframe.
- Harnessing AI in product management to build better products
- Lantern - A PostgreSQL vector DB for AI apps (docs.lantern.dev)
- Building RAG from Scratch (Open-source only!)
- Spotify Embraces AI, Eschews AI Music Ban and Adds AI Podcast Translation
- Question Answering with Langchain, Tair and OpenAI
- Question Answering with Langchain, AnalyticDB and OpenAI
- Open Interpreter with @Gradio demo in colab
- Efficient Post-training Quantization with FP8 Formats
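- GN⁺: Google picks Quora's (wrong) ChatGPT answer as the correct answer (twitter.com/8teapi)
- "Taking the lead in AI food tech"... Pulmuone GPT expands from healthcare to voice bots
- Digging into the secret of vLLM, up to 24x faster
- Export your entire ChatGPT history to Markdown (github.com/mohamed-chs)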
- QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
- LLM Startup Embraces AMD GPUs, Says ROCm Has ‘Parity’ With Nvidia’s CUDA Platform
- The work of bringing data to life
- Here are the four main ways you can improve your LLM/RAG app beyond the “naive” state (top-k) 📈:
- Textbook Quality - This project generates very long, textbook quality pretraining data.
- Google releases Studio Bot, its AI coding assistant for Android, in 170 countries (developers-kr.googleblog.com)
- Why "p<0.05" and "p>0.05" Aren't Enough?
- Can a computer recognize millions of products the way a person does?
- Generative AI by Getty Images Launches, Trained on NVIDIA Picasso
- tinytorch - Newest ML framework that you probably don't need, this is really autograd engine backed by numpy
- GN⁺: Amazon to invest up to 5.3 trillion won ($4B) in Anthropic (anthropic.com)
- Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
- VidChapters-7M: Video Chapters at Scale
- Document Topic Extraction with Large Language Models (LLM) and the Latent Dirichlet Allocation (LDA) Algorithm
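- What Microsoft's and Intel's AI strategies suggest about the future of the PC
- Best-selling US authors sue OpenAI... over unauthorized use of their works to train ChatGPT
- The AI reporter mocked for its clumsy articles, and more: 9 "AI disasters"
- Why adopting generative AI is a headache for IT leaders
- "Aiming to boost revenue with generative AI": 6 takeaways from Oracle CloudWorld 2023
- Coupang, which conquered Korea's warring-states e-commerce market. Naver Shopping innovating with Gen. AI, and other companies' M&A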
- Inside the Matrix: Visualizing Matrix Multiplication, Attention and Beyond
- Anthropic Cookbook - This cookbook contains recipes in the form of Jupyter notebooks for using Claude in neat and effective ways.
- A team at Microsoft Research has made available Phi 1.5, which is a 1.3 billion parameter model optimized for common sense reasoning in natural language, showing performance on par with models 5x its size, especially in grade-school mathematics and basic coding.
- Open AI releases GPT-4V(ision) system card
- LLM Meetup @Seoul 1: Problems LLMs have not solved
- Quickly create interactive & customizable multi-page dashboards with Taipy, the open-source Python library, with its simple syntax.
- ChatGPT can now see, hear, and speak
- MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation
- LLama 2 LLM for PDF Invoice Data Extraction
- Interpreting OpenAI's Whisper
- big.js - A small, fast, easy-to-use library for arbitrary-precision decimal arithmetic.
- Neo4j Schema Query Builder
- DualToken-ViT: Position-aware Efficient Vision Transformer with Dual Token Fusion
- ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
- Reversal Curse - This repository includes the source code and results for an experiment I did to verify the results in this paper, and to test a hypothesis that was immediately obvious to me upon reading the paper, given the intuition and understanding I have built up regarding language models so far. The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
- Medical Intake Experiment - Automated pipeline for medical intake, diagnosis, tests, etc. Meant to be used as a clinical aid.
- Text-Guided Vector Graphics Customization
- PyTorch Model Performance Analysis and Optimization — Part 6
- Using ChatGPT to make desk research 5x faster
- Local Embeddings with HuggingFace
- Artificial Intelligence Programs by Stanford Online
- Why transformative artificial intelligence is really, really hard to achieve
- How the Python Dataframe Interchange Protocol Makes Life Better
- FlagEmbedding can map any text to a low-dimensional dense vector which can be used for tasks like retrieval, classification, clustering, or semantic search. And it also can be used in vector databases for LLMs.
- Why Are There So Many Python Dataframes?
- Vector Search with OpenAI Embeddings: Lucene Is All You Need
- I'll go on the record with perhaps another contrarian opinion: Lucene HNSW is the future and (pure) vector DB vendors are in trouble. Why?
- Introducing Elasticsearch Relevance Engine™ — Advanced search for the AI revolution
- 5 Skills All Marketing Analytics and Data Science Pros Need Today
- End-to-end Autonomous Driving: Challenges and Frontiers
- Bohemian Matrices
- GitWit is a container-based agent specialized in making useful commits to git repositories.
- Embedding Similarity Evaluator
- 542.8 TB of high quality text:
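- Jeon Jae-jin, Kakao Enterprise cloud leader: "We have plenty of multimodal development experience... we will continue external collaboration"
- "In the generative AI era, relational databases will be a solid foundation"
- Oracle unveils a generative AI platform service for enterprises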
- How AI Is Changing the Way We Code
- Candle Llama2.c - Rust/WASM Demo
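- Hardware
- Reading
- Foreign exchange reserves as of the end of September 2023
- Time magazine's 100 best mystery and thriller books of all time
- How to design a table of contents - Tester's Choice
- After turning Statistics Korea upside down for a year... where did the "sample manipulation" go?
- IMF warns of prolonged inflation... "inflation was brought under control within 5 years in fewer than 60% of cases"
- Rest in peace, old books... "book funerals" at universities nationwide
- How each tax differs for temporary one-household, two-home owners
- Because it has never been like this before
- An annual salary of 78 million won... the Ministry of Economy and Finance's "official" line between the middle class and high earners?
- Lee Kyung-kyu: instinct and attitude
- Japan's Sompo Care: the secret to becoming a nursing-care industry leader... reaching 1 trillion in revenue by applying IT
- People no longer read the news even on Naver and Daum
- The companies trembling at the craze for the obesity drug Wegovy
- The story of the Great Depression
- 6 good movies for studying economics and finance
- Elementary schools these days have no exam rankings. How do children react when they experience performance-based competition for the first time? | The classroom education gap caused by studying ahead | What are the right criteria for distribution, and what is fairness? | Documentary K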
- Here's 15 Data Analytics Questions Google expects you to know:
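- Growing Shine Muscat grapes at home
- The pink miracle on the road
- Report on Korea's top 1% (NH Investment & Securities)
Bonus: The concept of in-context learning is hard to explain, so I used an example like this, and I think it works well, haha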
EOB