Google Releases Open Vision Language Model 'PaliGemma'

On May 15, 2024, Google announced the vision language model (VLM) 'PaliGemma' and the large language model (LLM) 'Gemma 2'. PaliGemma is already available, and an easy-to-use demo has also been published.

Introducing PaliGemma, Gemma 2, and an Upgraded Responsible AI Toolkit - Google Developers Blog

PaliGemma – Google's Cutting-Edge Open Vision Language Model - huggingface.co

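◆ Vision language model 'PaliGemma'
PaliGemma is a vision language model that takes an image as input and can perform tasks such as "describing the contents of an image", "understanding text in an image", and "separating objects in an image from the background".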
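
As a rough illustration of how the three tasks above might be requested, the sketch below builds the short task-prefix prompts that, as far as I understand from the Hugging Face write-up, the PaliGemma mix checkpoints expect (`caption en`, `ocr`, `segment <object>`). The `Task` enum and the `build_prompt` helper are hypothetical names introduced only for this example. The code just builds strings; it does not load or call the model.

```python
from enum import Enum


class Task(Enum):
    """Hypothetical labels for the three tasks mentioned in the post."""
    CAPTION = "caption"   # describe the contents of an image
    OCR = "ocr"           # read text that appears in an image
    SEGMENT = "segment"   # separate a named object from the background


def build_prompt(task: Task, target: str = "", lang: str = "en") -> str:
    """Return the text prompt to send alongside the image.

    The prefixes are an assumption based on the public PaliGemma
    write-ups, not something stated in this post.
    """
    if task is Task.CAPTION:
        return f"caption {lang}"
    if task is Task.OCR:
        return "ocr"
    if task is Task.SEGMENT:
        if not target:
            raise ValueError("segment needs the name of the object to segment")
        return f"segment {target}"
    raise ValueError(f"unsupported task: {task}")


if __name__ == "__main__":
    prompts = (
        build_prompt(Task.CAPTION),
        build_prompt(Task.OCR),
        build_prompt(Task.SEGMENT, target="cat"),
    )
    for prompt in prompts:
        print(prompt)
```

In the demo, a prompt like one of these would be typed into the text box next to the uploaded image.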

PaliGemma is available on GitHub, Hugging Face, Kaggle, and Vertex AI Model Garden, and NVIDIA is also developing a version of PaliGemma optimized for its own GPUs. A demo page where you can try PaliGemma's features has also been published.

PaliGemma Demo - a Hugging Face Space by google
https://huggingface.co/spaces/google/paligemma

I tried out PaliGemma's features on the demo page above.

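CEO Min, the hottest topic of the moment...

Recognition failed

As expected, Google is an American product after all...
It does recognize that person, though lol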

◆ Large language model 'Gemma 2'
In February 2024, Google released 'Gemma', an open LLM built on the research behind Gemini, and has now announced Gemma 2, an enhanced version of Gemma.

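Gemma 2 has 27 billion parameters and is said to deliver performance comparable to Llama 3 70B, which has 70 billion parameters. Gemma 2 is also optimized for NVIDIA GPUs and for Google's AI platform 'Vertex AI', and reportedly can run on less than half the resources of comparable models.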
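
To put the two parameter counts in perspective, here is a back-of-the-envelope sketch of the memory needed just to hold the weights. It assumes 2 bytes per parameter (bfloat16), which is my assumption and not a figure from the announcement, and it ignores activations, the KV cache, and any quantization. The function name `weight_memory_gb` is hypothetical. The "less than half the resources" claim above refers to Google's own measurements and is not derived from this arithmetic.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes bfloat16 storage; this is an assumption.
    """
    return num_params * bytes_per_param / 1e9


# Parameter counts as quoted in the post.
GEMMA_2_PARAMS = 27e9
LLAMA_3_70B_PARAMS = 70e9

gemma_gb = weight_memory_gb(GEMMA_2_PARAMS)
llama_gb = weight_memory_gb(LLAMA_3_70B_PARAMS)
ratio = GEMMA_2_PARAMS / LLAMA_3_70B_PARAMS

if __name__ == "__main__":
    print(f"Gemma 2 (27B) weights:   {gemma_gb:.0f} GB")
    print(f"Llama 3 70B weights:     {llama_gb:.0f} GB")
    print(f"Parameter ratio:         {ratio:.2f}")
```

Under these assumptions the 27B model's weights come to roughly 39% of the 70B model's, so the parameter counts alone already put it under half.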

Gemma 2 is still in pretraining, but it has already posted scores exceeding Grok on a variety of benchmarks.

Gemma 2 is scheduled to be released within the next few weeks.

License: Attribution-NonCommercial-NoDerivatives

두우우부
