
Friday, May 1, 2020

ழ & ற in Telugu Unicode: A Debate


10:35 AM

The Tamil letters 'ழ' and 'ற' have been added to the Telugu Unicode block (they are not part of the Telugu alphabet). As soon as this announcement came out, many people online began debating it, some in opposition and some in confusion. Whatever the case, informed debate is a sign of growth. Although this is not directly related to Tamil Unicode, it has become a topic of debate, so sharing views on it may help many people. Some of the accusations made by those who oppose this are baseless, but many of their questions are fair. I don't see a need to either support or oppose in this matter. As an observer, I put the common questions on this topic to Mr. Vinodh. His answers are below. Before reading them, you may want to read the proposal itself.

https://www.unicode.org/L2/L2020/20119-two-telugu-letters.pdf

A conversation with the proposal submitter on the Telugu Unicode block allocation for two Tamil characters

What is the purpose of proposing the Tamil characters (ழ, ற) for the Telugu block?

To digitize Tamil books written in Telugu script as they are.

Why can't we use the existing code points 0C5A & 0C34 instead of new ones?

Nope. Those documents use Telugu RRA and Tamil RRA at the same time.

And Unicode doesn't allow unifying two different looking characters. There is something called a 'graphemic' identity. Unicode encodes characters, not pronunciations.

There is a reason there is Malayalam Letter II and Malayalam Letter Archaic II. Both are for writing 'ii', but one is an archaic sign and the other one is modern. Since they look drastically different, you need two different code points for them.
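For readers who want to see the point about separate code points in plain text, here is a minimal Python sketch using the standard `unicodedata` module. It prints the names of the two Telugu code points from the question (0C34 and 0C5A) beside the existing Tamil ழ and ற, then looks up the two Malayalam letters mentioned above. The helper name `describe` is mine, not from the proposal, and the names shown depend on the Unicode version bundled with your Python.

```python
import unicodedata

# The two Telugu code points from the question, followed by Tamil ழ and ற.
CODE_POINTS = [0x0C34, 0x0C5A, 0x0BB4, 0x0BB1]


def describe(cp: int) -> str:
    """Format a code point as 'U+XXXX NAME' using Python's bundled Unicode data."""
    name = unicodedata.name(chr(cp), "<no name in this Unicode version>")
    return f"U+{cp:04X} {name}"


for cp in CODE_POINTS:
    print(describe(cp))

# Two Malayalam letters for the same sound 'ii', encoded as separate characters.
modern_ii = unicodedata.lookup("MALAYALAM LETTER II")
archaic_ii = unicodedata.lookup("MALAYALAM LETTER ARCHAIC II")
print(describe(ord(modern_ii)), "|", describe(ord(archaic_ii)))
print("same character?", modern_ii == archaic_ii)
```

Each character carries its own name and number, which is what lets a plain-text file keep a Telugu letter and a Tamil-derived letter apart on the same page.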

Why can't you use different fonts or private-use blocks for digitization instead of a Unicode block allocation?

Script rendering doesn't work that way. If you read the proposal, one of the options was to re-use the Tamil characters, but that would mean the font would never be rendered properly.

And it is within Unicode policy to encode borrowed characters as native characters, which is what is usually done.

See: Latin letter Omega or Latin letter Chi (these are Greek characters borrowed into Latin).
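The Latin/Greek precedent can be checked the same way. The sketch below looks up the small-letter forms by their Unicode character names and compares the code points; the helper `code_point` and the `PAIRS` list are illustrative names of mine, and the lookups assume a Python whose Unicode data includes these Latin letters.

```python
import unicodedata

# (borrowed Latin letter, original Greek letter), by Unicode character name.
PAIRS = [
    ("LATIN SMALL LETTER OMEGA", "GREEK SMALL LETTER OMEGA"),
    ("LATIN SMALL LETTER CHI", "GREEK SMALL LETTER CHI"),
]


def code_point(name: str) -> int:
    """Return the code point of the character with the given Unicode name."""
    return ord(unicodedata.lookup(name))


for latin_name, greek_name in PAIRS:
    latin_cp = code_point(latin_name)
    greek_cp = code_point(greek_name)
    # The borrowed letter gets its own code point instead of reusing the Greek one.
    print(f"U+{latin_cp:04X} {latin_name}  vs  U+{greek_cp:04X} {greek_name}")
```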

Is there any historical evidence for the use of these two characters in Telugu works outside a religious context?

Tamil verses to be recited in Telugu still use these characters. There is enough evidence for them in the proposal.

What is the historical relation of ழ or ற with the Telugu language?

ழ was in Old Telugu but was lost in modern Telugu. Hence, the Tamil letter was borrowed into Telugu.

Do you have any support from the Telugu community or a Telugu organization for this proposal?

Why? A language or script does not belong to anyone. They are needed for digitizing existing books and that is enough to propose them. It's not that I invented something ex nihilo.

Latin has a few hundred obscure characters, Devanagari even more. Most of them are historical. A script always has a broader scope than the language. Telugu as such is cosmopolitan (like English. It is open to influences and is never shy to embrace change and, hence, it survives even after a dozen generations in Tamil Nadu). It has been used to write a number of languages other than Telugu: Namely, Tamil and Urdu. Even minority Dravidian languages like Kolami.

There are at least a couple of dozen books that need to be digitized. If you can tell me how to digitize them preserving the graphemic identities and linguistic cross-cultural influence in plain-text, do let me know. I'm all ears.

Can you give examples of similar cross-language allocations in the Indic Unicode blocks?

Gujarati has consonants to write Avestan.

Lao pretty much invented consonants out of nowhere to write Buddhist Pali in 1930s.

Syriac has borrowed letters to write ழ ள ற ன.

Arabic has created letters to write Tamil ழ, ள ற ன (It's called Arwi : Arabic-Tamil).

Linguistic borrowing is a common thing. When cultures interact, they inevitably merge and borrow.

Zha (ழ) is a character special to Tamil; why should we offer it to another language?

Nope, it is not.

See above. As of now, I can write ழ in Malayalam, Kannada, Telugu, Arabic, and Syriac.

Do not forget that Chinese also has the same phoneme.

In terms of phonology, it is called a retroflex approximant and is found in at least a dozen languages.

https://en.wikipedia.org/wiki/Voiced_retroflex_approximant

Can the same happen if someone proposes adding Telugu characters to the Tamil script?

If there are borrowings, then yes. You cannot change history and erase cultural interaction.

As I said, when two cultures interact, they inevitably mix. There is a reason why Yiddish, Syrio-Malayalam, and Arabic-Tamil exist.

This is imposing Tamil characters on the Telugu script.

Latin has hundreds of characters. Most of them archaic, and only used to digitize old books. This includes characters borrowed from Greek and Cyrillic.

A script is always much more alive and elastic than the language itself and is a separate entity.

Telugu written in the Telugu script or the Tamil script or the Kadamba script still remains Telugu.

Tamil written in the Tamil script or the Telugu script or the Arabic script or even Brahmi or Vatteluttu still remains Tamil.

Scripts and languages do not have a one-to-one connection. They are separate entities.

Why spoil the uniqueness of both languages?

How does adding two additional characters spoil anything? Does it stop people from doing anything that they can do today?

Nope. They're gonna lay dormant. Like hundreds and thousands of Unicode characters and be used by some poor librarian or philology researcher, who wants to study how Tamil and Telugu interact and digitize the texts.

If you're not a philology researcher, you don't need them.

For instance, if you are not an old English researcher, you don't need ſ (long s). You can happily write /sun/ without ever using the character, but again some poor philologist out there needs to differentiate sun & ſun. They need that. It is relevant to them. For others, it is irrelevant and doesn't affect them in any way.

These are just that. Characters for research, philology and digitization.
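As a rough illustration of the long s example, the Python sketch below compares "sun" and "ſun" as plain text. It also hints at why a separate code point matters: compatibility normalization (NFKC) folds ſ into s, so the distinction survives only while the text is stored without that folding. The variable names are mine.

```python
import unicodedata

LONG_S = "\u017f"           # ſ, the long s
modern = "sun"
historical = LONG_S + "un"  # ſun

# The long s is a character in its own right, with its own name.
print(unicodedata.name(LONG_S))

# As plain text, the two spellings are different strings.
print("equal as typed:", modern == historical)

# Canonical normalization (NFC) keeps the long s intact.
print("unchanged by NFC:", unicodedata.normalize("NFC", historical) == historical)

# Compatibility normalization (NFKC) replaces ſ with s and loses the distinction.
print("equal after NFKC:", unicodedata.normalize("NFKC", historical) == modern)
```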

Unicode is full of old and archaic characters. If I were forced to use every one of them, then it would be imposing. The last time I checked, no one is doing that :)

7 comments:

Kannabiran, Ravi Shankar (KRS) May 1, 2020 at 6:04 PM
Warm greetings in the grace of Tamil!

You yourself say that you are an observer, don't you?
When you are not a subject expert,
on what basis did you conclude that /some of the accusations made by those who oppose this are baseless/, sir?

Further, with /I don't see a need to either support or oppose in this matter/,
how could you, not as a subject expert but merely as an observer, arrive at such a foregone conclusion?

May I, as a subject expert and a university professor, refute in this post,
with proper evidence, the claims made by you and Mr. Vinodh?
Kannabiran, Ravi Shankar (KRS) May 1, 2020 at 6:40 PM
/Do you have any support from Telugu community/organization on this proposal?
Why? A language or script does not belong to anyone/

The very starting point is an ethical error! :(
This is exactly what is called 'authoritarianism'!
What right do you have to say that a language belongs to no one?

A language is the property of its people!
Those people have a right over that language's character repertoire (whether the repertoire for everyday use or the one for research)!
Even Tolkappiyar, the founding grammarian of this language, does not claim to lay down the rules himself.
He states the grammatical principles of the language with deference to the people, saying only "என்மனார் புலவர், என்மனார் புலவர்" ("so say the scholars, so say the scholars")!

/A language or script does not belong to anyone
They're gonna lay dormant.
For others, it is irrelevant and doesn't affect them in any way/
Citing reasons like these,
couldn't anyone dump thousands upon thousands of dormant characters into a language's repertoire?

Suppose I, as a researcher, wanted to use the subtle sounds of Mandarin Chinese in Tamil,
and brought thousands of Chinese characters into Tamil under the label 'dormant'. Where would this end?
妈/媽 mā, 麻 má, 马/馬 mǎ, 骂/罵 mà, 吗/嗎 ma
If I spun a story that Tamil's single ம (ma) cannot handle all these 'ma's,
and did a Wholesale Importation, could the language bear it? Think about it!

As an academician & researcher, I wish to document Mandarin Chinese religious poetry in English.
But I am falling short of orthography to reproduce them in English.
So, can I introduce wholesale importation of all these Mandarin Characters into English? Will English Academia accept it?

I do understand the need for niche orthography in Unicode, for transcribing religious texts across languages.
But just for transcribing, one cannot do a 'blind import' of consonants into another language, thereby affecting the substratum of that language and its people.

Alternate options do exist to transcribe Tamil ழ & ற to Telugu,
via LLLA & RRA in Latin script, or use International Phonetic Alphabet (IPA).
Some poor librarian or philology researcher can use IPA or alternate methods to transcribe non-native characters.

What if all the poor librarians of all the languages of the world start blindly importing thousands upon thousands of their consonants into Tamil?
Is it fair to coerce a language and its millions of people, just to please a handful of librarians & researchers?
An esteemed body like Unicode Consortium should not burden millions of native speakers for the sake of a few religious transcribers.
Kannabiran, Ravi Shankar (KRS) May 1, 2020 at 6:53 PM
/Arabic has created letters to write Tamil ழ, ள ற ன (It's called Arwi : Arabic-Tamil)/

You shouldn't toss out claims so superficially, sir!
Arwi is a script created for the use of Tamil Muslims.
Not for one or two librarians, but for a sizeable public!
It exists so that Tamil Muslims can study religious texts in Arabic without difficulty!

Nobody did a Wholesale Importation of the ழ/ற letterforms into Arwi!
Instead, they created new letters that run from right to left, just like Arabic.
In Arwi, the letter ழ is written as ۻ!
Nobody dropped Tamil letters in there as they are!
Starting from அ-ஆ-இ-ஈ, Arwi has a separate letterform (ی-اِ-آ-اَ) for every sound and letter!
Please be advised that there is NO wholesale importation of consonants.
Kannabiran, Ravi Shankar (KRS) May 1, 2020 at 7:06 PM
Again, Gujarati is the traditional language of the Indian Zoroastrians.
It is not for a handful of librarians, but for its people!
Indian Zoroastrians transcribe Avestan in Nagri script-based scripts or Gujarati script, without any 'blind wholesale importation' of Avestan Characters.
Avestan letters with no corresponding symbol in Gujarati are synthesized only with additional diacritical marks, not changing/importing characters blindly.

Linguistic borrowing is a common thing only when cultures interact, NOT when a handful of librarians interact.
Culture is an embrace by the people, not just the whims and fancies of a few lone people and their transcribing interests.

Respect the People, for it's they, who own & nurture the Language!

Kannabiran, Ravi Shankar (KRS) May 2, 2020 at 8:13 AM
You are doing all this on the grounds that it makes it convenient to write Tamil pasurams (hymns) in Telugu, aren't you?
But look: in this entire conversation in this post, there is not a trace of Tamil! :) All of it is in English!

This is the level of understanding of the 'poor librarians'!
To play their own pet games,
they bring in all sorts of letterforms
and say, "They're dormant anyway, let them stay!"
Is a language your household dustbin? :((((( Alas!
T.N. Muralidharan - Moongil Kaatru May 6, 2020 at 7:08 PM
Do Telugu scholars accept this? They are the ones who should chiefly oppose it. What is troubling here is that the Unicode Consortium accepts the views of individuals. That is wrong. If everyone starts making changes on their own, it will turn into a farce. Such changes should be made only at the request of the government of the people who speak the language concerned, or of a recognized body for that language. If this change is brought to the attention of the speakers of that language and its recognized body, I believe they themselves will oppose it. If they accept it, nothing can be done. I believe that no one who loves their language will accept the letters of another language being imposed on it. Opposition from the people of Andhra to Vinodh Rajan will certainly arise.