Cautionary statements or disclaimers in corporate annual reports need to be carefully designed because clear cautionary statements may protect a company in the case of legal disputes but may also undermine positive impressions. This study compares the language of cautionary statements using two corpora, Sony’s cautionary statement corpus (S-corpus) and Panasonic’s cautionary statement corpus (P-corpus), illustrating the differences and similarities in the use of meaningful cautionary statements and critically analyzing why practitioners write them the way they do. The findings reveal distinct differences between the two companies in the presentation of risk factors and in the way they make the statements. The word ability is used more for legal protection in the S-corpus, whereas the word possibility is used more to convey a better impression in the P-corpus. The main similarities lie in the use of lexical words and pronouns, and in wording that stayed almost the same for eight years. The findings show how each company makes its statements unique in the presentation of risk factors, and they highlight the characteristics of this specific genre of professional communication. An important implication of this study is that a more comprehensive approach can be applied in other contexts and used by companies to reflect upon their cautionary statements.
This paper investigated the code-mixing features of Mandarin-English bilingual children in Singapore. First, it examined whether the code-mixing rate differed between Mandarin Chinese and English contexts. Second, it explored the syntactic categories of code-mixing in Singapore bilingual children. Third, it investigated whether morphological information was preserved when syntactic components were inserted into the matrix language. Data were derived from the Singapore Bilingual Corpus, in which the recordings and transcriptions of sixty English-Mandarin 5-to-6-year-old children were preserved for analysis. Results indicated that the rate of code-mixing was asymmetrical in the two language contexts, with the rate being significantly higher in the Mandarin context than in the English context. The asymmetry is related to language dominance in that children are more likely to code-mix when using their nondominant language. Concerning the syntactic categories of code-mixed words in the Singaporean bilingual children, we found that noun-mixing, verb-mixing, and adjective-mixing were the three most frequent categories of code-mixing in the Mandarin context. This pattern mirrors the syntactic categories of code-mixing in the Cantonese context in Cantonese-English bilingual children, and the general trend observed in lexical borrowing. Finally, our results indicated that English words that carry morphological information are embedded in bare forms in the Mandarin context. These findings shed light upon how bilingual children take advantage of their two languages in mixed utterances in a bilingual environment.
This paper focuses on how government-led language policies and political changes in Taiwan shape language choice in translations, and on what translation strategies translators employ to express their language ideology amid power struggles and decision-making. Framed by Lefevere’s theoretical concept of translating as rewriting, and carried out as a diachronic study, this paper sets out to investigate the language ideology and translators’ idiolects in Chinese-language translations of Anglo-American novels. The examples used to explore these issues are taken from different Chinese renditions of Mark Twain’s English-language novel The Adventures of Huckleberry Finn, which contains several dialogues originally written in the colloquial language and dialect used in the American state of Mississippi and reproduced in Mark Twain’s works. Using an adapted corpus methodology, many examples are extracted from the translated texts and the source text to illuminate how translators in Taiwan deal with the dialectal features encoded in Twain’s works, and how Taiwanese translators use different Chinese translations to confirm the language policies and to express their language identity textually in different periods of the past five decades, from the 1960s onward. The findings of this study suggest that the use of Taiwanese dialect and language patterns in translations does relate to the mother-tongue language movement and the language ideology of the translator, as well as to the issue of language identity raised on the island of Taiwan.
Furthermore, this study confirms that changes of political power in Taiwan have a significant impact on language policy (assimilationism, pluralism or multiculturalism), which has in turn moved Taiwan from a monolingual to a multilingual society, where language ideology and identity are revealed not only in people’s daily communication but also in written translations.
The process of translation is not merely a matter of linguistics; it is also considered within the cultural frameworks of both the source and the target text. In the 20th century, the translation process and translated texts were confronted with a new perspective, one considered mostly in terms of the patronage and ideological frameworks of the target language. To scrutinize these factors in the translation process, both micro-level and macro-level factors can be taken into consideration. This study adopted a qualitative design based on a critical discourse analysis approach, and George Orwell’s novel “1984” was chosen as the corpus for a contrastive analysis with its Persian translations. The results revealed some distortions embedded in the target texts that were shaped by ideology and the patronage network. The manipulated terms fell into various categories, which revealed the aspects of manipulation in the translated texts.
The paper deals with the main methodological issues of the Corpus of Spoken Lithuanian, whose development began in 2006. At present, the corpus consists of 300,000 grammatically annotated word forms. The creation of the corpus consists of three main stages: collecting the data, transcribing the recorded data, and grammatical annotation. Data collection was based on the principles of balance and naturalness. The recorded speech was transcribed according to the CHAT requirements of CHILDES. The transcripts were double-checked and annotated grammatically using CHILDES. The development of the Corpus of Spoken Lithuanian has led to a steady increase in studies on spontaneous communication, and various papers have dealt with the distribution of parts of speech, the use of different grammatical forms, variation in inflectional paradigms, the distribution of fillers, the syntactic functions of adjectives, and the mean length of utterances.
It is acknowledged that small and medium enterprises (SMEs) may encounter different ethical issues and pressures that could affect the way in which they strategize or make decisions concerning the outcome of their business. Therefore, this research aimed at assessing entrepreneurial ethics in the business of SMEs in Nigeria. Secondary data served as the corpus for the analysis. The findings indicate that a sound entrepreneurial ethics system has a significant effect on the level of performance of SMEs in Nigeria. The Nigerian Government needs to provide both guiding and physical structures, as well as learning systems that could inculcate these entrepreneurial ethics.
The present study addressed the nature of bilingual semantic processing in Mandarin Chinese and Southern Min and examined category effects and age effects. Nineteen adult bilingual speakers of Mandarin Chinese and Southern Min, nine monolingual seniors speaking Mandarin Chinese, and ten monolingual seniors speaking Southern Min in Taiwan individually completed two semantic tasks: a picture naming task and a category fluency task. The instruments for the naming task were sixty black-and-white pictures, including thirty-five object pictures and twenty-five action pictures. The category fluency task also consisted of two semantic categories: objects (or nouns) and actions (or verbs). The reaction time for each picture or question was additionally calculated and analyzed. Oral productions in Mandarin Chinese and in Southern Min were compared and discussed to examine the category effects and age effects. The results of the category fluency task indicated that the seniors’ informational content had comparatively deteriorated, and thus they produced a smaller number of semantic-lexical items. Significant group differences were also found in the reaction time results. Category effects were significant for both adults and seniors in the semantic fluency task. The findings of the present study will help characterize the nature of bilingual semantic processing in adults and seniors, and contribute to the fields of contrastive and corpus linguistics.
In any language, the inventory of greetings is dynamic, with frequent additions and losses, although speakers hardly notice this. In this register, there are a number of constant, conservative elements that survive different language models (among them, the classic formulae bună ziua! (good afternoon!), bună seara! (good evening!), noapte bună! (good night!), la revedere! (goodbye!)) and a number of items that fail to pass the test of time, depending on language use at a given time (ciao!, pa!, bai!). The source of innovation depends both on internal factors (contraction, conversion, combination of classic greeting formulae) and on external ones (borrowings and calques). As some formulae gain in frequency through use, others are eliminated from use. This paper presents a sociolinguistic approach to contemporary Romanian greetings, based on prosodic surveys in two research projects: AMPRom and SoRoEs. Romanian has a rich inventory of questions (especially partial interrogatives, or WH-questions) which are used as greetings, alone or, more commonly, accompanying a proper greeting. The typical representative of these formulae is Ce mai faci? (How are you?), which, unlike its English counterpart How do you do?, has not become a stereotype, but retains an obvious emotional impact, while serving as a marker of sociolinguistic group. The analyzed corpus consists of structures containing greetings recorded in the main Romanian cultural (urban) centers. Methodologically, the acoustic analysis of the recorded data is performed using software tools (GoldWave, Praat), identifying intonation patterns related to three sociolinguistic variables: age, sex and level of education. The intonation patterns of the analyzed statements are at the interface between partial questions and typical greetings.
The aim of this paper is to identify and examine the issue of linguistic redundancy in two competing grammars of Malay, namely the school grammar and the corpus grammar. The former is a normative grammar which is formally and prescriptively taught in the classroom, whereas the latter is a descriptive grammar that students, as native speakers of the language, acquire and master informally outside the classroom. Corpus grammar is described on the basis of actual use in naturally occurring texts, as attested in the corpus. It is observed that the grammar taught in schools is incompatible with the grammar used in the corpus. For instance, a noun phrase containing a reduplicated nominal form that denotes plurality (e.g. murid-murid ‘students’, derived from murid ‘student’) together with a quantifier as modifier (e.g. semua ‘all’, seluruh ‘entire’, and kebanyakan ‘most’) is not acceptable in the school grammar because the formation (e.g. semua murid-murid ‘all the students’, kebanyakan pelajar-pelajar ‘most of the students’) is claimed to be redundant, and redundancy is prohibited in the grammar. Redundancy is generally construed as the property of speech and language by which more information is provided than is precisely required for the message to be understood, so that, if some information is omitted, the remaining information will still be sufficient for the message to be comprehended. Thus, the correct construction is strictly either the reduplicated form (e.g. murid-murid ‘students’) or the quantifier plus the root (e.g. semua murid ‘all the students’), so that the grammatical meaning of plural is not repeated. Nevertheless, the so-called redundant form (e.g. kebanyakan pelajar-pelajar ‘most of the students’) is frequently used in the corpus grammar. This study shows that a number of redundant forms occur in the morphology of the language, particularly in affixation, reduplication and the combination of both.
Apparently, the so-called redundancy has grammatical and socio-cultural functions in communication, namely to give emphasis and to stress the importance of the information delivered by speakers or writers.
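The quantifier-plus-reduplication pattern described above lends itself to a simple corpus query. The following is only a minimal sketch, not the method used in the study: it assumes plain-text input, covers only the three quantifiers named in the abstract, and uses a hypothetical function name, find_redundant_plurals.

```python
import re

# Quantifiers named in the abstract; a real study would need a fuller list.
QUANTIFIERS = ("semua", "seluruh", "kebanyakan")

# A quantifier followed by a fully reduplicated noun (X-X),
# e.g. "semua murid-murid". Group 3 captures the base X, and the
# backreference \3 requires the second half to repeat it.
PATTERN = re.compile(
    r"\b(" + "|".join(QUANTIFIERS) + r")\s+((\w+)-\3)\b",
    re.IGNORECASE,
)


def find_redundant_plurals(text):
    """Return (quantifier, reduplicated_noun) pairs found in text (hypothetical helper)."""
    return [
        (m.group(1).lower(), m.group(2).lower())
        for m in PATTERN.finditer(text)
    ]
```

Counting the pairs this returns over a corpus, against the counts for the quantifier-plus-root and bare reduplicated forms, would give a rough measure of how frequent the so-called redundant construction is. Partial reduplication and affixed forms are deliberately out of scope here.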
The increasing prevalence of childhood obesity has raised interest in early and late indicators of weight gain. Blood cell counts may be indicators of pro-inflammatory states. The aim was to evaluate associations of hematological parameters, including hematocrit (HTC), hemoglobin, blood cell counts and their indices, with the degree of obesity in a pediatric population. A total of 249 children (139 morbidly obese (MO), 82 healthy normal-weight (NW) and 28 overweight (OW)) were included in the study. WHO BMI-for-age percentiles were used to form age- and sex-matched groups. Informed consent forms and Ethics Committee approval were obtained. Anthropometric measurements were performed. Hematological parameters were determined. Statistical analyses were performed using SPSS. The threshold for statistical significance was p≤0.05. Significant differences (p=0.000) between the waist-to-hip ratios and head-to-neck ratios (hnrs) of MO and NW children were detected. A significant difference between the hnrs of OW and MO children (p=0.000) was observed. Red cell distribution width (RDW) was higher in OW children than in the NW group (p=0.030). No such difference could be detected between the MO and NW groups. Increased RDW was prominent in OW children. Compared with the NW group, the decrease in mean corpuscular hemoglobin concentration (MCHC) values was sharper in MO children than in OW children (p=0.006 vs p=0.042). Statistically significantly higher HTC levels were observed in the MO-NW comparison (p=0.014), but not in the OW-NW comparison. Though the cause-effect relationship between obesity and erythrocyte indices still needs further investigation, alterations in RDW, HTC and MCHC during obesity may be of significance in early life.
The 3D body movement signals captured during human-human conversation include clues not only to the content of people’s communication but also to their culture and personality. This paper is concerned with the automatic extraction of this information from body movement signals. For the purpose of this research, we collected a novel corpus from 27 subjects and arranged them into groups according to their culture. We arranged each group into pairs, and each pair conversed about different topics. A state-of-the-art recognition system is applied to the problems of person, culture, and topic recognition. We borrowed modeling, classification, and normalization techniques from speech recognition. We used Gaussian Mixture Modeling (GMM) as the main technique for building our three systems, obtaining 77.78%, 55.47%, and 39.06% accuracy from the person, culture, and topic recognition systems respectively. In addition, we combined the above GMM systems with Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and 40.63% accuracy for person, culture, and topic recognition respectively. Although direct comparison among these three recognition systems is difficult, it seems that our person recognition system performs best for both GMM and GMM-SVM, suggesting that intersubject differences (i.e. subjects’ personality traits) are a major source of variation. When removing these traits from the culture and topic recognition systems using the Nuisance Attribute Projection (NAP) and the Intersession Variability Compensation (ISVC) techniques, we obtained 73.44% and 46.09% accuracy from the culture and topic recognition systems respectively.
The growth over the centuries in the volume of text data, such as books and articles in libraries, has made it necessary to establish effective mechanisms for locating them. Early techniques such as abstraction, indexing and the use of classification categories marked the birth of a new field of research called "Information Retrieval". Information Retrieval (IR) can be defined as the task of defining models and systems whose purpose is to facilitate access to a set of documents in electronic form (a corpus), allowing users to find the documents relevant to them, that is to say, the contents that match their information needs. Most information retrieval models use a specific data structure to index a corpus, called an "inverted file" or "reverse index". This inverted file collects information on all terms across the corpus documents, specifying the identifiers of the documents that contain the term in question, the frequency of each term in the documents of the corpus, the positions of the occurrences of the word, and so on. In this paper we use an object-oriented database (db4o) instead of the inverted file, that is to say, instead of searching for a term in the inverted file, we search for it in the db4o database. The purpose of this work is to make a comparative study to see whether object-oriented databases can compete with the inverted index in terms of access speed and resource consumption on a large volume of data.
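For readers unfamiliar with the baseline structure, the inverted file described above can be sketched in a few lines. This is an illustrative in-memory version only, assuming whitespace tokenization and lowercasing; the function names build_inverted_index and lookup are hypothetical, and the sketch says nothing about how the paper's system or db4o stores the same information.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]} for a dict of doc_id -> text.

    Tokenization is naive (lowercase, split on whitespace); this is an
    assumption made for illustration.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index


def lookup(index, term):
    """Return {doc_id: (frequency, positions)} for one term, or {} if absent."""
    postings = index.get(term.lower(), {})
    return {doc_id: (len(pos), pos) for doc_id, pos in postings.items()}
```

A lookup returns, per document identifier, the term frequency and the positions of the occurrences, which are the three pieces of information the inverted file is said to hold. Looking a term up is a single dictionary access, which is the access cost any alternative store would be compared against.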
Textual data plays an important role in the modern world. The possibilities of applying data mining techniques to uncover hidden information present in large text collections are immense. The Growing Self Organizing Map (GSOM) is a highly successful member of the Self Organising Map family and has been used as a clustering and visualisation tool across a wide range of disciplines to discover hidden patterns present in data. A comprehensive analysis of the GSOM’s capabilities as a text clustering and visualisation tool has so far not been published. These functionalities, namely map visualisation, automatic cluster identification and hierarchical clustering, are presented in this paper and are further demonstrated with experiments on a benchmark text corpus.
Software Architecture plays a key role in software development, but the absence of a formal description of Software Architecture creates various impediments in software development. To cope with these difficulties, ontology has been used as an artifact. This paper proposes an ontology for Software Architectural design based on the IEEE model for architecture description and the Kruchten 4+1 model for viewpoint classification. For the categorization of styles and views, ISO/IEC 42010 has been used. A corpus method has been used to evaluate the ontology. The main aim of the proposed ontology is to classify and locate Software Architectural design information.
Corpus luteum cross-sectional area (by ultrasonography) and plasma progesterone (by DELFIA) were estimated in early pregnant and non-pregnant cows on day 14 and on days 20 to 23 post insemination. On day 14, corpus luteum sectional area was 348.43 mm² in pregnant and 387.84 mm² in non-pregnant cows. Within days 20 to 23, corpus luteum sectional area ranged between 342.06 and 367.90 mm² in pregnant and between 193.85 and 270.69 mm² in non-pregnant cows. Plasma progesterone level was 2.43 ng/ml in pregnant and 2.46 ng/ml in non-pregnant cows on day 14, while during days 20 to 23 the level ranged between 2.47 and 2.84 ng/ml in pregnant and between 0.53 and 1.17 ng/ml in non-pregnant cows. Both luteal tissue areas and plasma progesterone levels were highly significantly different (P<0.01) between pregnant and non-pregnant cows during days 20 to 23, but there were no significant differences on day 14. The correlation between CL cross-sectional area and plasma progesterone level was 0.4 in pregnant cows and 0.99 in non-pregnant cows. It is clear from this study that ultrasonic assessment of corpora lutea is a viable alternative to determining plasma progesterone levels for early pregnancy diagnosis in cows.
In this paper, a new adaptive Fourier decomposition (AFD) based time-frequency speech analysis approach is proposed. Because the fundamental frequency of speech signals often fluctuates, the classical short-time Fourier transform (STFT) based spectrogram analysis suffers from the difficulty of window size selection. AFD is a newly developed signal decomposition theory designed to deal with time-varying, non-stationary signals. Its outstanding characteristic is that it provides an instantaneous frequency for each decomposed component, which makes time-frequency analysis easier. Experiments are conducted on a sample sentence from the TIMIT Acoustic-Phonetic Continuous Speech Corpus. The results show that the AFD based time-frequency distribution outperforms the STFT based one.
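The window size difficulty mentioned for the STFT comes from a fixed trade-off: a window of N samples at sampling rate fs spans N/fs seconds and yields frequency bins spaced fs/N Hz apart, so improving one resolution worsens the other. The sketch below only illustrates that arithmetic; the function name stft_resolution and the window lengths are assumptions for illustration, not settings taken from the paper. The 16 kHz rate is the sampling rate of the TIMIT recordings.

```python
def stft_resolution(sample_rate_hz, window_len):
    """Return (window duration in seconds, DFT bin spacing in Hz).

    Hypothetical helper: window_len is the number of samples per
    analysis window, with no zero-padding assumed.
    """
    window_duration_s = window_len / sample_rate_hz
    bin_spacing_hz = sample_rate_hz / window_len
    return window_duration_s, bin_spacing_hz


if __name__ == "__main__":
    # Illustrative window lengths; not the paper's experimental settings.
    for n in (256, 1024):
        duration, spacing = stft_resolution(16000, n)
        print(f"window={n} samples: {duration * 1000:.1f} ms, {spacing:.3f} Hz per bin")
```

The product of the two returned values is always 1, so no single window length resolves both rapid pitch changes and closely spaced harmonics. That limitation is what motivates methods that assign an instantaneous frequency to each component.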