UTF-8: a character encoding

Unicode supports nearly all existing character sets. The preferred form for encoding the Unicode character set is UTF-8. It offers compatibility with ASCII, resistance to data corruption, efficiency, and ease of processing. But first things first.

Encoding forms

Computers handle numbers not only as abstract mathematical objects but also as combinations of fixed-size units for storing and processing data: bytes and 32-bit words. An encoding standard has to take this into account when deciding how numbers are represented.

In computers, integers are stored in memory cells of 8 bits (1 byte), 16 bits, or 32 bits. Each Unicode encoding form defines which sequence of memory cells represents the integer that corresponds to a given character. The standard provides three different forms for encoding Unicode characters, built on 8-bit, 16-bit, and 32-bit units. Accordingly, they are known as UTF-8, UTF-16, and UTF-32. The name UTF stands for Unicode Transformation Format. Each of the three encoding forms is an equally valid way of representing Unicode characters, and each has advantages in different areas.

Each encoding form can represent every character in the Unicode standard. The forms are therefore fully interoperable between implementations that chose different encodings for different reasons. Any one of them can be converted unambiguously into either of the other two without loss of data.
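The lossless conversion between the three forms can be illustrated with a short Python sketch. The sample string and variable names are chosen only for illustration, and the little-endian codec variants are used so that no byte order mark is added.

```python
# Round-trip one string through all three Unicode encoding forms.
# The sample text is an arbitrary illustration: its characters need
# 1, 2, 3 and 4 bytes respectively in UTF-8.
text = "Aж漢😀"

utf8 = text.encode("utf-8")
utf16 = utf8.decode("utf-8").encode("utf-16-le")
utf32 = utf16.decode("utf-16-le").encode("utf-32-le")
back = utf32.decode("utf-32-le")

print(len(utf8), len(utf16), len(utf32))  # sizes in bytes for each form
print(back == text)                       # True if nothing was lost
```

The byte counts differ between the forms, but the decoded text is the same at every step.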

The non-overlap principle

Each Unicode encoding form is designed so that its code units do not overlap, which is not true of every encoding. For example, Windows-932 builds characters from one or two code bytes. The length of a sequence depends on its first byte, so the values of the leading byte of a two-byte sequence and the values of a single byte are disjoint. However, the values of a single byte and of the trailing byte of a sequence can coincide. This means, for example, that a search for the character "D" (code 44) can falsely match the second byte of the two-byte character "Д" (code 84 44). To find out which interpretation is correct, the program has to take the preceding bytes into account.

The situation gets worse when leading and trailing byte values coincide as well. To resolve the ambiguity, the program then has to search backward until it reaches the start of the text or an unambiguous code sequence. This is not only inefficient but also fragile, because a single bad byte can make the entire text unreadable.

The Unicode transformation formats avoid this problem because the values of leading, trailing, and single storage units never overlap. This makes all Unicode encodings suitable for searching and comparison: a match can never be a false hit that straddles parts of different character codes. Observing the non-overlap principle is what sets these encoding forms apart from the East Asian multi-byte encodings.
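The false match described above can be reproduced with a few lines of Python. This is only a sketch: it relies on Python's built-in "cp932" codec as a stand-in for Windows-932, and the variable names are illustrative.

```python
# Search for the ASCII letter "D" (byte 0x44) inside the encoded
# Cyrillic letter "Д" in two different encodings.
needle = "D".encode("ascii")

sjis = "Д".encode("cp932")   # two bytes: 0x84 0x44
utf8 = "Д".encode("utf-8")   # two bytes: 0xD0 0x94

sjis_hit = sjis.find(needle)  # index of a match, or -1 if none
utf8_hit = utf8.find(needle)

print(sjis.hex(" "), sjis_hit)  # the trailing byte matches "D" by accident
print(utf8.hex(" "), utf8_hit)  # no byte of the UTF-8 sequence can match
```

In the UTF-8 case every byte of a multi-byte character has its high bit set, so it cannot be confused with an ASCII byte.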

Another consequence of non-overlapping Unicode encodings is that every character has clearly defined boundaries. This removes the need to scan back over an indefinite number of preceding characters. The property is often called self-synchronization. A corrupted code unit damages only one character, and the surrounding characters stay intact. In the 8-bit transformation format, if a pointer lands on a byte of the form 10xxxxxx (in binary), finding the start of the character takes at most three backward steps.
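A minimal Python sketch of that backward step is shown below. The function name `char_start` is hypothetical, and the sketch assumes the input is valid UTF-8.

```python
def char_start(data: bytes, index: int) -> int:
    """Return the index of the first byte of the character that
    contains data[index]. Assumes data is valid UTF-8."""
    steps = 0
    # Trailing bytes have the bit pattern 10xxxxxx. A character is at
    # most 4 bytes long, so at most 3 backward steps are needed.
    while index > 0 and steps < 3 and (data[index] & 0xC0) == 0x80:
        index -= 1
        steps += 1
    return index


data = "a€😀".encode("utf-8")  # 1 + 3 + 4 bytes
print(char_start(data, 3))    # last byte of "€"
print(char_start(data, 7))    # last byte of "😀"
```

Because the check looks only at the top two bits of each byte, no decoding state from earlier in the text is required.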

Conformance

The Unicode Consortium fully endorses all three encoding forms. It is important not to set UTF-8 against Unicode, because all of the transformation formats are equally valid ways of implementing the character encoding forms of the Unicode standard.

Byte orientation

UTF-32 represents each character with a single 32-bit code unit whose value is the Unicode code point itself. UTF-16 uses one or two 16-bit units. UTF-8 uses up to 4 bytes.
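The number of code units per character can be checked with a small Python sketch. The helper name `code_units` is an assumption made for this example.

```python
def code_units(ch: str) -> dict:
    """Count how many code units one character needs in each form."""
    return {
        "UTF-8": len(ch.encode("utf-8")),            # 8-bit units
        "UTF-16": len(ch.encode("utf-16-le")) // 2,  # 16-bit units
        "UTF-32": len(ch.encode("utf-32-le")) // 4,  # 32-bit units
    }


for ch in ("A", "€", "😀"):
    print(ch, code_units(ch))
```

UTF-32 always needs one unit, UTF-16 one or two, and UTF-8 between one and four.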

UTF-8 was designed for compatibility with byte-oriented, ASCII-based systems. Much existing software and information technology practice has long relied on representing characters as a sequence of bytes. Many protocols depend on the invariance of the ASCII encoding and either use or avoid special control characters. A simple way to adapt such systems to Unicode is to use an 8-bit encoding for Unicode characters in which every ASCII character and control character keeps its ASCII byte value. That is exactly what UTF-8 does.

Variable length

UTF-8 is a variable-length encoding made up of 8-bit storage units whose high-order bits show which part of a sequence each byte belongs to. One range of values is reserved for the first code unit of a sequence, another for the units that follow. This is what guarantees the non-overlap of the encoding.
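The role of each byte can be read from its high-order bits alone, as the following Python sketch suggests. The function name and the role labels are illustrative, not part of any standard API.

```python
def byte_role(b: int) -> str:
    """Classify a single UTF-8 byte by its high-order bits."""
    if (b & 0x80) == 0x00:
        return "single"   # 0xxxxxxx: one-byte (ASCII) character
    if (b & 0xC0) == 0x80:
        return "trail"    # 10xxxxxx: continuation byte
    if (b & 0xE0) == 0xC0:
        return "lead-2"   # 110xxxxx: starts a 2-byte sequence
    if (b & 0xF0) == 0xE0:
        return "lead-3"   # 1110xxxx: starts a 3-byte sequence
    if (b & 0xF8) == 0xF0:
        return "lead-4"   # 11110xxx: starts a 4-byte sequence
    return "invalid"      # 11111xxx never occurs in UTF-8


print([byte_role(b) for b in "é€".encode("utf-8")])
```

The ranges are disjoint, so a byte can never be both the start and the continuation of a character.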

ASCII

UTF-8 is fully compatible with ASCII codes (0x00 to 0x7F). Unicode characters U+0000 to U+007F are converted to the single bytes 0x00 to 0x7F in UTF-8 and are therefore indistinguishable from ASCII. Moreover, to avoid ambiguity, the values 0x00 to 0x7F are never used in any byte other than the single-byte representation of these characters. Characters in the range U+0080 to U+07FF, which covers many non-ideographic scripts beyond ASCII, are encoded as two-byte sequences. Characters in the range U+0800 to U+FFFF are represented by three bytes, and supplementary characters with codes above U+FFFF require four bytes.
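The ranges above map directly onto a small encoder. The following Python function is a sketch written for illustration (the name `encode_utf8` is hypothetical); in practice the built-in `str.encode("utf-8")` does the same job.

```python
def encode_utf8(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 by hand."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp <= 0x7F:        # U+0000 to U+007F: one byte, same as ASCII
        return bytes([cp])
    if cp <= 0x7FF:       # U+0080 to U+07FF: two bytes
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:      # U+0800 to U+FFFF: three bytes
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),   # above U+FFFF: four bytes
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


print(encode_utf8(0x20AC).hex(" "))  # the euro sign, U+20AC
```

Every byte after the first carries six payload bits behind the 10 prefix, which is why the range boundaries fall at 7, 11, 16, and 21 bits.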

Preferred usage

UTF-8 is generally the preferred encoding for HTML and similar protocols.

XML was among the first standards to fully support UTF-8. Standards organizations recommend it as well. The problem of supporting non-ASCII characters in URLs was resolved when the W3C and the IETF engineering group agreed that all URLs should be encoded exclusively in UTF-8.
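As an illustration of UTF-8 in URLs, the sketch below percent-encodes a path with Python's standard `urllib.parse` module. The path itself is a made-up example.

```python
from urllib.parse import quote, unquote

# A hypothetical path containing one non-ASCII character.
path = "/café"

# quote() encodes the text as UTF-8 by default and writes each
# non-ASCII byte as a %XX escape; "/" is left untouched.
encoded = quote(path, encoding="utf-8")
decoded = unquote(encoded, encoding="utf-8")

print(encoded)
print(decoded == path)
```

The letter "é" becomes two escapes because it occupies two bytes in UTF-8.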

Compatibility with ASCII eases the transition to the new encoding. Most text editors work with UTF-8, including JEdit, Emacs, BBEdit, Kuro, and Notepad on the Windows operating system. No other Unicode encoding form can boast this level of tool support.

An advantage of the encoding is that it is a plain sequence of bytes. UTF-8 strings are easy to work with in C and other programming languages. It is the only encoding form that requires neither a byte order mark (BOM) nor an encoding declaration in XML.

Self-synchronization

In environments that process characters as 8-bit units, UTF-8 has the following advantages over other multi-byte character encodings:

  • The first byte of a code sequence carries information about its length. This makes forward parsing more efficient.
  • The start of a character is easy to find, because the initial byte is restricted to a fixed range of values.
  • Byte values do not overlap.
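The first advantage in the list can be sketched in Python as a forward parser that reads only the leading byte to decide how far to jump. The helper names are hypothetical, and the sketch assumes well-formed input apart from rejecting bytes that cannot start a sequence.

```python
from typing import List


def sequence_length(lead: int) -> int:
    """Return the length of the sequence that starts with this byte."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    raise ValueError(f"byte 0x{lead:02X} cannot start a UTF-8 sequence")


def split_characters(data: bytes) -> List[bytes]:
    """Split UTF-8 data into per-character byte sequences."""
    chars = []
    i = 0
    while i < len(data):
        n = sequence_length(data[i])  # known from the first byte alone
        chars.append(data[i:i + n])
        i += n
    return chars


print(split_characters("a€😀".encode("utf-8")))
```

Trailing bytes are never inspected to find a boundary, which is what keeps the forward scan cheap.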

Comparative advantages

UTF-8 is compact. It is at a size disadvantage only for East Asian text (Chinese, Japanese, and Korean, which use Chinese ideographs), where 3-byte sequences are required. UTF-8 is also less efficient to process than the other encoding forms. A binary sort of UTF-8 strings gives the same result as a binary sort of Unicode code points.
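The binary sorting property can be checked with a short Python sketch. The word list is arbitrary, and the UTF-16 comparison is included only as a contrast; Python compares `str` values by code point.

```python
# Arbitrary sample strings covering 1-, 2-, 3- and 4-byte characters.
words = ["z", "é", "漢", "😀", "a", "\uffff"]

by_code_point = sorted(words)  # str comparison is by code point
by_utf8_bytes = sorted(words, key=lambda s: s.encode("utf-8"))
by_utf16_units = sorted(words, key=lambda s: s.encode("utf-16-be"))

print(by_utf8_bytes == by_code_point)   # same order
print(by_utf16_units == by_code_point)  # differs for this sample
```

In UTF-16 the surrogate pair for "😀" starts with 0xD83D, which sorts below U+FFFF even though the code point U+1F600 is larger.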

Character encoding schemes

A character encoding scheme consists of a character encoding form plus a method of serializing code units into bytes. To identify the encoding scheme, the Unicode standard provides for an initial byte order mark (BOM).

In UTF-8, a BOM serves only as a marker that this encoding is in use. UTF-8 has no byte order issues, because its code unit is a single byte. Using a BOM with this encoding form is neither required nor recommended. A BOM may appear in text converted from other encodings that use a byte order mark, or as a signature of the UTF-8 encoding. It is the 3-byte sequence EF BB BF (hexadecimal).
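Detecting and removing the UTF-8 signature can be sketched in Python as follows. The function name `strip_utf8_bom` is an assumption for this example; `codecs.BOM_UTF8` and the `utf-8-sig` codec are part of the standard library.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM (EF BB BF) if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


with_bom = codecs.BOM_UTF8 + "hello".encode("utf-8")
print(strip_utf8_bom(with_bom))
# The "utf-8-sig" codec drops the signature while decoding.
print(with_bom.decode("utf-8-sig"))
```

Data without a signature passes through unchanged, so the function is safe to apply unconditionally.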

How to set the UTF-8 encoding

In HTML, the UTF-8 encoding is set with the following code:

<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>

In PHP, the UTF-8 encoding is set with the header() function at the beginning of the file, after the error reporting level has been set:

<?php
// Report all errors, then declare the response as UTF-8 HTML.
error_reporting(-1);
header('Content-Type: text/html; charset=utf-8');

For a connection to a MySQL database, the UTF-8 encoding is set like this:

<?php
// The mysql_* extension was removed in PHP 7. With mysqli the equivalent call
// is mysqli_set_charset($link, 'utf8'), where $link is your own connection.
mysql_set_charset('utf8');

In a CSS file, the UTF-8 character encoding is declared as follows:

@charset "utf-8";

When saving files of any type, choose UTF-8 without a BOM; otherwise the site may not work correctly. To do this in Dreamweaver, open the menu item "Modify - Page Properties - Title/Encoding" and change the encoding to UTF-8. Then reload the page, clear the "Include Unicode Signature (BOM)" check box, and apply the changes. If any text on the page or in the database was entered in another encoding, it has to be re-entered or converted. When working with regular expressions, be sure to use the u modifier.

You can save a file as UTF-8 in Notepad on Windows. Choose the menu item "File - Save As...", set the appropriate encoding, and save the file in UTF-8.

In the Notepad++ text editor, if the file is in an encoding other than UTF-8, use the menu item "Convert to UTF-8 without BOM" to convert the characters, and then save the file in UTF-8.
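The same conversion can be scripted. The Python sketch below assumes the source encoding is known in advance (Windows-1251 is used purely as an example), and the function name is hypothetical.

```python
import codecs


def to_utf8_without_bom(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known legacy encoding as UTF-8."""
    text = data.decode(source_encoding)
    # Python's "utf-8" codec never writes a BOM when encoding.
    return text.encode("utf-8")


# Example input: text that was stored as Windows-1251 (an assumption).
legacy = "Привет".encode("cp1251")
converted = to_utf8_without_bom(legacy, "cp1251")

print(converted.decode("utf-8"))
print(converted.startswith(codecs.BOM_UTF8))  # no signature was added
```

If the source encoding is guessed wrongly, the decode step either fails or yields the wrong characters, so the encoding has to be confirmed before converting in bulk.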

No alternative

In an era of globalization, when the borders between countries and languages are fading, character encodings tied to a local character set are of limited use. Unicode is a single character set that supports every localization. UTF-8 is an example of a sound implementation of Unicode, one that:

  • supports a wide range of tools, including compatibility with the ASCII encoding;
  • is resistant to data corruption;
  • is simple and efficient to process;
  • is platform-independent.

With the arrival of UTF-8, the debate over which encoding or character set is better has become pointless.

Copyright © 2018 sn.birmiss.com. Theme powered by WordPress.