Francesca Bianchi
Role
Researcher
Organisation
Università del Salento
Department
Department of Humanities (Dipartimento di Studi Umanistici)
Scientific Area
AREA 10 - Antiquity, Philological-Literary and Historical-Artistic Sciences
Scientific Disciplinary Sector
L-LIN/12 - Language and Translation - English Language
ERC Sector, Level 1
SH - Social sciences and humanities
ERC Sector, Level 2
SH4 The Human Mind and Its Complexity: Cognitive science, psychology, linguistics, philosophy of mind
ERC Sector, Level 3
SH4_9 Theoretical linguistics; computational linguistics
According to Dann, in current Western society, tourist choices are determined by personality, lifestyle, tourist role, social class, and culture – the latter understood as “systems of beliefs, norms, values and sanctions, which ultimately guide [people’s] behaviour” (Dann, 1993: 105). Others, such as Pizam and Sussmann (1995: 904), point out that several studies based on direct or indirect methods of assessment suggest that nationality does influence tourist behaviour, although it should certainly not be considered the only factor. Starting from the belief that natural language is a mirror of national culture, this paper explores the working hypothesis that tourist preferences based on national culture can surface from the semantic analysis and comparison of large corpora revolving around ‘tourism’ in different languages. To this end, three general Web corpora in different languages (British English, Italian, and Russian) – created by members of the Web-As-Corpus group – were used to extract three sub-corpora of 10,000 full sentences revolving around the node word tourism (turismo in Italian, and туризм in Russian). For each language, concordances of the node words were generated, sorted, and manually analysed, in order to highlight the linguistic labels for types of tourism. The labels thus identified were then grouped according to a hierarchical classification that organized labels into semantic fields and semantic fields into conceptual domains. The resulting classifications were compared at all three semantic levels, since the contrastive approach is fundamental to identifying cultural specificities. The cross-linguistic comparison showed traces of globalisation, along with marked cultural specificities. Globalisation emerged when the same semantic fields (or sometimes labels) appeared in all three corpora.
On the other hand, the presence or absence of a specific label or field in a given corpus, or a marked difference in the number of labels within a given semantic field, was considered an indicator of cultural specificity, on condition that a suitable explanation could be found in the history or traditions of the given culture or that support came from other sources. In practice, cultural differences seemed to be frequently determined by the geographical, natural and historical situation of the country. Almost certainly, information of this type could also be retrieved in other ways, such as from expert informants for the given cultures, or from scientific or popular literature in the field. This, however, does not lessen the potential relevance of the current study, as it offers a further method, to our knowledge not yet applied, for reaching the desired goal. Furthermore, the much larger number of labels retrieved in this study, compared to the number of labels found in dedicated and published classifications of tourism, testifies to the advantages of using corpora.
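The concordancing step mentioned in this abstract (generate and sort concordance lines for a node word) can be illustrated with a minimal keyword-in-context sketch. It is not the tool used in the study, which the abstract does not name. The function name `kwic`, the 30-character context window, and sorting on the word immediately to the left of the node are all assumptions made for illustration.

```python
import re


def kwic(sentences, node, width=30):
    """Return (left, node, right) concordance lines for `node`.

    Lines are sorted on the word immediately left of the node, which is
    where English tends to place tourism-type labels ("rural tourism").
    This sort order is an illustrative assumption, not the study's.
    """
    pattern = re.compile(rf"\b{re.escape(node)}\b", re.IGNORECASE)
    lines = []
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            left = sentence[:match.start()][-width:]
            right = sentence[match.end():][:width]
            lines.append((left, match.group(0), right))

    def left_word(line):
        words = line[0].split()
        return words[-1].lower() if words else ""

    return sorted(lines, key=left_word)


if __name__ == "__main__":
    sample = [
        "Rural tourism is growing.",
        "They promote cultural tourism in the region.",
    ]
    for left, hit, right in kwic(sample, "tourism"):
        print(f"{left:>30} [{hit}] {right}")
```

For languages where the label follows the node (for example turismo culturale in Italian), sorting on the first word of the right context would be the natural variant.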
Culture, corpora and semantics is a methodological investigation into the use of elicited data and Web data in the analysis of cultural specificities starting from semantic elements. After discussing several theoretical and analytical approaches to culture in linguistics, anthropology, psychology, and marketing research, the book applies a specifically developed method of analysis and cross-cultural comparison to elicited data on chocolate and wine, gathered through free sentence-completion and sentence-writing tests administered to English and Italian respondents. The results obtained are discussed within the framework of cultural systems theories and used as a control reference for further methodological investigations. In particular, the elicited data are qualitatively and quantitatively compared to non-elicited sentences on chocolate and wine from general Web corpora in English and Italian. Furthermore, to avoid the complex and time-consuming process of manually coding a large dataset, some alternative routes are tested, based on the creation of sub-corpora using sampling procedures and on the analysis of a limited number of the most frequent words in the dataset’s wordlist. An automatic semantic tagger is also tested on the elicited data, in order to assess the extent of its possible application in cultural analysis. Comparisons between the Web corpora and the elicited data suggest that large general Web corpora can be considered representative of the cultural associations of a node word and could thus be used in cultural analysis or in exploratory marketing research. Finally, in the light of the results of the various methodological tests, the work discusses general issues, such as the relationship between word frequency and cultural relevance, and tagset granularity.
The analysis of the two words in British English – chocolate and wine – and their denotationally comparable terms in Italian (cioccolato/a, cioccolatino/i, and vino/i) provides the opportunity to test different types of data, sampling procedures, coding methods, and a set of cultural theories in the identification of the cultural associations of those terms. As the subtitle of the book clarifies, the goal of the present work is methodological, namely the development of a viable corpus linguistics method for distinguishing cultural associations of a given word from personal mental associations. To this end, an interdisciplinary approach was adopted. The theoretical framework for this work draws on several disciplines that study culture through language, though from different perspectives, namely corpus linguistics, cultural studies, marketing, anthropology and psychology, with a focus on their shared elements relevant to the goal of the present research. This was considered necessary in order to make the method applicable outside linguistics. However, the book presents a linguistic piece of research and addresses a prospective audience of linguists. The work accomplishes two main goals. First, from a cultural perspective, it selects a cultural framework – cultural systems theories – that lends itself to computational semantic analysis, and develops a computational procedure for distinguishing the mental associations anchored in culture from those which are not. Second, from a methodological perspective, the quantitative comparisons performed between the entire datasets (both elicited and Web-based) on the one hand, and smaller samples of the data on the other, show, in this particular context, to what extent findings based on smaller data samples are generalisable to the whole database the samples come from, thus adding useful pieces of information to our general knowledge in corpus linguistics.
In sum, this book makes a foray into a multidisciplinary approach to the study of corpora, culture and semantics and provides researchers involved in (cross)cultural analysis
This paper reports on our research to generate multilingual semantic lexical resources and develop multilingual semantic annotation software, which assigns each word in running text to a semantic category based on a lexical semantic classification scheme. Such tools have an important role in developing intelligent multilingual NLP, text mining and ICT systems. In this work, we aim to extend an existing English semantic annotation tool to cover a range of languages, namely Italian, Chinese and Brazilian Portuguese, by bootstrapping new semantic lexical resources via automatically translating existing English semantic lexicons into these languages. We used a set of bilingual dictionaries and word lists for this purpose. In our experiment, with minor manual improvement of the automatically generated semantic lexicons, the prototype tools based on the new lexicons achieved an average lexical coverage of 79.86% and an average annotation precision of 71.42% (if only precise annotations are considered) or 84.64% (if partially correct annotations are included) on the three languages. Our experiment demonstrates that it is feasible to rapidly develop prototype semantic annotation tools for new languages by automatically bootstrapping new semantic lexicons based on existing ones.
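The two evaluation figures reported in this abstract, lexical coverage and annotation precision with or without partially correct annotations, can be made concrete with a short sketch. The exact definitions used in the paper are not given in the abstract, so the formulas below are an assumed reading: coverage as the share of tokens found in the lexicon, and precision as the share of human-checked annotations judged correct. The function names, the toy lexicon, and the judgement labels are hypothetical.

```python
def lexical_coverage(tokens, lexicon):
    """Share of tokens that receive at least one tag from the lexicon."""
    if not tokens:
        return 0.0
    covered = sum(1 for token in tokens if token.lower() in lexicon)
    return covered / len(tokens)


def annotation_precision(judgements, include_partial=False):
    """Share of annotations accepted by a human checker.

    `judgements` holds one label per annotated token: "correct",
    "partial" or "wrong" (hypothetical label names).
    """
    if not judgements:
        return 0.0
    accepted = {"correct", "partial"} if include_partial else {"correct"}
    return sum(1 for label in judgements if label in accepted) / len(judgements)


if __name__ == "__main__":
    # Toy lexicon; the tags are illustrative placeholders in USAS style.
    toy_lexicon = {"the": ["Z5"], "wine": ["F2"], "chocolate": ["F1"]}
    tokens = ["The", "wine", "was", "superb"]
    print(f"coverage: {lexical_coverage(tokens, toy_lexicon):.2%}")

    checked = ["correct", "partial", "wrong", "correct"]
    print(f"strict precision: {annotation_precision(checked):.2%}")
    print(f"lenient precision: {annotation_precision(checked, True):.2%}")
```

Reporting both a strict and a lenient precision figure, as the abstract does, shows how much of the error comes from annotations that are only partly wrong.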
Translating humour is a notoriously tricky task and requires a good deal of creativity. If seen as a problem-solving activity, creativity can be taught and developed. Bianchi (2012) sketches a problem-solving method based on a search for abstract categories emerging from the analysis of the most prominent linguistic features of the source text. The current paper develops the teaching idea in Bianchi (2012) into a problem-solving procedure and provides examples using humorous lines from an animated film. In particular, creative translation has been shown to develop over four cyclical stages: the conscious reading and analysis of the text (Preparation); an unconscious stage where ideas are reorganized (Incubation); the elaboration of possible solutions (Illumination); and conscious selection of the most adequate translation (Evaluation/Verification). Divergent thinking and fluency of thinking – i.e. the ability to produce a large number of thoughts and associations related to a given problem in a short time – are among the most prominent cognitive skills needed in this type of creative activity. The Stable Hyper-island Procedure (SHIP) subdivides the Preparation stage into four steps. Step 1 invites a careful analysis of the source text. Step 2 suggests reorganizing the textual features into a smaller number of abstract categories (hyper-islands), which move the student’s attention away from the surface elements of the text – words and phrasing – and towards broader semantic, pragmatic and cultural issues. Step 3 guides the student in redefining the original problem (i.e. translating the given lines of text) as a new, more specific problem (e.g. create a marketing slogan that advertises a washing station for whales and includes some kind of poetic cohesive pattern and word play). Step 4 fosters divergent thinking and fluency of thinking through brainstorming.
The procedure then continues by explicitly inviting the students to produce as many alternative solutions to the newly defined problem as possible (Step 5), and by inviting them to select the solution that is most suitable in the given context (Step 6, Evaluation/Verification stage). All the steps are equally important, but the SHIP takes its name from Step 2, since the newly established abstract islands are the key element on which all subsequent steps build. The hyper-islands, which favour divergent thinking and provide stable references for the Illumination and Evaluation phases, should be identified keeping in mind the pragmalinguistic and cultural context, as well as the skopos of the source text. Abstraction may take place at several levels, including semantics (passing from hyponym to hypernym), morpho-syntax (e.g. from a given derived word to the general category of derivation), or phonetics (e.g. from a specific rhyme or assonance to the category itself, or higher, e.g. poetic device). In any case, it is a fundamental step without which it would be extremely difficult to redefine the original problem as a new one that is more precise and at the same time open to several different solutions. Finally, in the procedure’s name, hyper-islands are defined as stable, since they represent a bridge between the source and target texts, by suggesting a new level of equivalence. As such, they mediate between free and faithful translation, two apparently incongruous elements that frequently guide external judgements of translation success.
The integration of migrants into the host country is facilitated by a good knowledge of its culture and language. The mediator, for their part, in order to act adequately in this delicate role, must have an excellent knowledge of both languages and both cultures, at the level of everyday life as well as in specific domains such as the medical, educational, administrative, and legal ones, depending on the circumstances in which they operate. This paper illustrates how film and television products represent a rich, powerful and flexible tool for teaching and (self-)learning in the acquisition of linguistic and cultural competences, one that seems well suited to the needs of mediators and migrants alike. Through a selected review of the existing literature, the paper presents the potential of film and television products in the teaching of cultural content (Section 3) and in foreign language teaching (Section 4), with particular attention to the needs of self-learning. However, a full and correct use of these products for educational purposes requires awareness of the peculiarities of filmic language in all its components, which are therefore systematically outlined first (Section 2). The main aim of this contribution is to make the cultural mediator an expert and aware viewer, able to gain the greatest benefit in terms of linguistic and cultural learning from film and television products and able, when necessary, to direct and guide the immigrant in the use of this powerful tool for (self-)learning and integration.
The aim of the current experiment was to test the teaching and research potential of interactive features of selection, deselection, tagging and logging in the analysis of reading-comprehension processes. To this end, LearnWeb – an interactive platform integrating TED talks – was used to involve 25 Italian MA students of consecutive interpreting in analytical tasks gauging their reading-comprehension abilities in English. Their selections, deselections, and annotations were automatically collected by the system and manually analysed by the researchers. The analyses provided answers to the following research questions: Were any of the tasks perceived as difficult by the students? How did the students approach each task? How did the logs contribute to understanding the students’ approaches to the tasks? The types of exercises used fit a large range of learning scenarios, and the resources, analytical methods and results described in this paper may be relevant to anyone interested in discourse comprehension.
The current paper describes a first experiment in the use of TED talks and open tagging exercises to train higher-level comprehension skills, and of automatic logging of the students’ actions to investigate their choices while performing analytical tasks. The experiment took advantage of an interactive learning platform – LearnWeb – that integrates TED talk videos and transcripts and enriches them with tagging features and a data logging system. The data collected offered answers to the following questions: Which of the three tasks was perceived by the students as the most difficult? How did the students approach each task? How did the logs contribute to an understanding of the students’ approaches to the tasks? The experiment also suggested ideas for further development of LearnWeb’s log features from a pedagogical and research perspective.
The past decades have seen the development of various semantic lexical resources such as WordNet (Miller, 1995) and the USAS semantic lexicon (Rayson et al., 2004), which have played an important role in the areas of natural language processing and corpus-based studies. Recently, increasing efforts have been devoted to extending the semantic frameworks of existing lexical knowledge resources to cover more languages, such as the EuroWordNet and the Global WordNet. In this paper, we report on the construction of large-scale multilingual semantic lexicons for twelve languages, which employ the unified Lancaster USAS semantic annotation and provide a multilingual lexical knowledge base for the USAS automatic semantic annotation system. Our work contributes towards the goal of constructing larger-scale and higher-quality multilingual semantic lexical resources and developing corpus annotation tools based on them. Lexical coverage is an important factor concerning the quality of the lexicons and the performance of the corpus annotation tools, and in this experiment we focus on evaluating the lexical coverage achieved by the multilingual lexicons and semantic annotation tools based on them. Our evaluation shows that some semantic lexicons such as those for Finnish and Italian have achieved lexical coverage of over 90% while others need further expansion.
In translation, creativity can be described as a type of problem-solving activity applied to problems that are usually open-ended, unfolding in four cyclical stages: preparation (the conscious stage of understanding and analysing the text), incubation (the unconscious stage in which ideas are reorganized), illumination (the elaboration of possible solutions) and evaluation (the choice of the most suitable translation solution). The main cognitive skills involved in the creative process include divergent thinking and the ability to produce, in a short time, many different thoughts and associations of ideas in relation to a single given problem (‘fluency of thinking’). This paper suggests that, to foster the development of creative skills in translation students, translation teaching should highlight the stages of the creative process and offer methods and tools that can guide the translator towards solving the problem. In particular, it proposes as the primary working method the search for ‘islands of stability’, understood as abstract categories deriving from the main textual elements identified in the analysis of the source text. The identification of islands of stability through processes of abstraction, which completes the preparation stage, shifts attention from the word to linguistic-cultural categories (such as antonymy, specific semantic or pragmatic areas, rhetorical figures). In doing so, it provides new elements that on the one hand favour divergent thinking and on the other create reference points for the illumination and evaluation stages. Finally, by acting as a bridge between the source text and the target text, the islands of stability represent a moment of mediation between fidelity to the source text and freedom in translation, elements on which external judgement is often based.
The article considers a particular strand of animated films in which the human world merges parodically with that of animals, objects or fantasy to create a new world, with a strong comic and self-ironic effect, and carries out a linguistic analysis of Shark Tale, one of the most creative and interesting examples of this new trend. Because of its structure, Shark Tale lends itself particularly well to being used as an example for discussing creativity in the context of multimedia, translation and subtitling. The analysis highlighted the creative strategies adopted in the production of the film Shark Tale and in its localization for the Italian market. In both cases a working macro-strategy can be distinguished, within which various micro-strategies operate, often interacting with one another. In the original, the macro-strategy is realised through the construction of a “new” world, a point of contact between the ocean and the human world. To achieve this result, the micro-strategies described in Section 2 were applied, namely interaction of all media levels, use of commonplaces and stereotypes, more or less forced parallels between the human and animal worlds, interaction between the characters’ image and voice, and wordplay. In the Italian version, on the other hand, the working macro-strategy is represented by the application of choices of adaptation, loss and compensation aimed at reconstructing the new world. Within this macro-strategy lie creative linguistic choices that exploit the micro-strategies described for English.
The aim of this study is to illustrate the discursive and promotional strategies that luxury tour operators use on Facebook. For this purpose, Facebook posts published on the official pages of luxury tour operators were compared with posts published by general or budget tour operators on their own pages. The posts were analysed both quantitatively and qualitatively in terms of their content, images used, and linguistic features. The results suggest that luxury tour operators use their official Facebook pages like a catalogue of destinations, whereas general and budget tour operators use their pages to engage clients in forms of (social) interaction and to create a community.
The current study aims to investigate the potential of subtitling (i.e. creating subtitles) as a means to teach and learn specialised content and a foreign language simultaneously, and attempts to measure its impact by comparing creating subtitles to watching subtitled video. This was operationalized in the following research questions: Does creating subtitles help the acquisition of scientific content? Does creating subtitles help the acquisition of scientific vocabulary? How does creating subtitles compare to watching subtitled video? And does creating subtitles increase the student’s interest in science? In order to answer these research questions, two experiments were carried out. First, a group of students created English and Italian subtitles for a set of short videos in English about chemistry and physics. Subsequently, some of the videos were shown to a different group of students, accompanied by English and/or Italian subtitles. All the students were tested on the contents and language in the videos. The students who created subtitles were tested about seven days after completion of the work, while the students who watched ready-made subtitles were tested immediately after watching the video. The study showed that both activities (watching ready-made subtitles and creating subtitles) helped content understanding and language memorization. It also suggested that creating subtitles is probably a much more effective activity for language and content acquisition than watching subtitles. Finally, it showed that, though both activities increased students’ interest in science, creating subtitles increased it to a greater extent.
This paper analyses the official English subtitles of Joe Wright’s Pride & Prejudice, considered as a possible source of insight into the themes of the film and the contribution of dialogues to the filmic product. The subtitles are considered as a corpus, and quantitative as well as qualitative analytical methods are applied. In particular, the following methodological steps are performed: automatic semantic tagging; extraction of key domains, an extension of the keywords concept; extraction of key POS tags, used to seek confirmation of a stylistic trend appearing in the analysis; and investigation of concordance lines. Several key domains are identified and discussed. As well as providing empirical support for observation, the results show how much information can be conveyed by subtitles, and their concurrent role in the creation of the overall product. Specifically, the analysis shows that the dialogues in Joe Wright’s film version of Pride & Prejudice perform two fundamental roles: they underline interpersonal relations and family ties, and even more prominently describe society and manners. Some core themes are common to both personal and social levels of life. These are: advantage; judgment; happiness; and hope/expectations. Love pertains only to the personal sphere and is expressed as romantic love or motherly/sisterly affection. Social life is depicted in terms of social manners, social obligations, social activities, social respect, and social judgment. Furthermore, action is not a dominant element in the dialogues and, when mentioned, it is always socially oriented: dancing and playing the piano, visiting people, or talking to or about people. Finally, the concept of pride emerges among the key domains and, interestingly, it is expressed through a range of terms, including vanity, arrogance, conceit, snob, pompous, boasts, selfish, as well as pride and proud. Such lexical richness illustrates pride in its several semantic components.
As an adjunct, semantic tagging highlighted that these dialogues are unusually rich in adjectives and adverbs expressing the speakers’ degree of appreciation or certainty.
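Key-domain extraction, described in this abstract as an extension of the keywords concept, compares the frequency of each semantic tag in the study corpus against a reference corpus. The abstract does not say which statistic was used. The sketch below assumes the log-likelihood measure that is common in corpus linguistics, with a cut-off of 6.63 (p < 0.01, one degree of freedom). The tag names and counts are invented for illustration, and both corpora are assumed to be non-empty.

```python
import math


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Two-corpus log-likelihood: 2 * sum(observed * ln(observed / expected))."""
    total = size_study + size_ref
    combined = freq_study + freq_ref
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    score = 0.0
    for observed, expected in ((freq_study, expected_study), (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


def key_domains(study_counts, ref_counts, threshold=6.63):
    """Return (tag, score) pairs for tags overused in the study corpus."""
    size_study = sum(study_counts.values())
    size_ref = sum(ref_counts.values())
    results = []
    for tag, freq in study_counts.items():
        ref_freq = ref_counts.get(tag, 0)
        overused = freq / size_study > ref_freq / size_ref
        score = log_likelihood(freq, size_study, ref_freq, size_ref)
        if overused and score >= threshold:
            results.append((tag, round(score, 2)))
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Invented counts of semantic tags in a subtitle corpus and a reference corpus.
    subtitles = {"SOCIAL": 60, "MOVEMENT": 10, "OTHER": 930}
    reference = {"SOCIAL": 20, "MOVEMENT": 30, "OTHER": 950}
    print(key_domains(subtitles, reference))
```

Filtering on overuse keeps only domains that are more frequent than expected in the study corpus. Underused domains, such as the invented MOVEMENT tag here, could be reported separately, in line with the abstract's observation that action is not dominant in the dialogues.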
LearnWeb is an educational Web platform specifically designed for searching, collecting, sharing, and analysing open-access multimedia resources, through individual or collaborative activities. Among the many resources accessible through the platform are TED talks, an open set of videos with multilingual transcripts that are gaining momentum as multimedia teaching resources. An advantage of accessing TED talks and transcripts in LearnWeb derives from the availability of extra interactive features specifically designed to support learning. More specifically, the students can highlight a word or part of a sentence in the transcript and tag it with an open annotation; furthermore, when the mouse passes over a highlighted word, the system automatically shows a set of definitions and synonyms for that word, taken from WordNet. Finally, if the students are not happy with their selection or tag, they can delete it and make a different one. The students’ activities (their selections, tags, deletions, etc.) are logged, and log files can be accessed by the teachers for research purposes. This paper illustrates the potential (and the limits of the current version) of LearnWeb and its TED-related features for the teaching and learning of higher-level comprehension skills in academic curricula, by describing its use in a module on interpreting. Specifically, the selecting and tagging features available on TED transcripts were used to help the students develop skills such as understanding discourse structure, distinguishing key elements of discourse from exemplifications and peripheral elements, and identifying speech acts. Furthermore, the logging feature allowed the teacher to collect valuable information on the students’ choices and difficulties while performing those tasks. While exercises of this sort do not necessarily require the use of software, detailed monitoring of the students’ choices and mistakes would not be possible otherwise.
The current study offers a qualitative and quantitative analysis of the Italian subtitling of the narrator’s voice in two science documentaries for the general public. Specifically, it outlines the strategies used to translate the narrator’s spoken lines, identifies the linguistic elements that were manipulated, and suggests possible explanations for such manipulations. For each video, the Italian subtitles were first compared with the English audio. This comparison aimed to identify the subtitling strategies adopted in this particular type of video material. The subtitles were classified depending on the type of strategy applied. Furthermore, for each strategy the type of linguistic element involved was observed (e.g. modifier, adverb, downtoner, etc.). This two-layered analysis showed that while some of the instances of text manipulation corresponded to the well-known needs in subtitling of shortening and simplifying on the one hand and clarifying on the other, the remaining instances were a deliberate attempt to increase the level of formality of the text. Subsequently, in order to verify whether such a shift in the tenor of discourse simply depended on the shift in mode due to subtitling, where speech is rendered in the written form, the Italian subtitles were compared to the corresponding Italian dubbed lines. It was thus observed that the Italian dubbed version featured exactly the same strategies and linguistic devices as the subtitles. This led me to conclude that the observed shift in the tenor of discourse represents the translators’ attempts to adapt the text to Italian culture and that achieving greater formality should be considered a driving force in the subtitling of science documentaries from English into Italian, on a par with clarifying, simplifying and shortening.
The current paper is intended to provide a description of the linguistic and discourse strategies displayed by generalised and budget tour operators on Facebook, with particular emphasis on the techniques employed to promote destinations. To this end, a corpus of 326 posts was created. The posts were analysed by means of corpus linguistics methods (including POS tagging, keyword analysis, and analysis of collocations), while the images accompanying the posts were analysed within the framework of visual grammar theory. Keyword analysis showed that the posts under investigation, despite being written texts, are closer to spoken communication than to informal written communication. The analyses also showed ample presence of linguistic and rhetorical techniques typical of tourism promotion. Furthermore, the analyses proved that the tourism operators considered are expert conversation managers who have developed a range of strategies to influence conversation. Finally, comparison between the current results and previous empirical studies suggests that promotional strategies, and thus ‘the language of tourism’, vary not only from culture to culture, but also depending on text type (e.g. website vs. Facebook page), tourism service provider (e.g. hotel chains vs. tour operators), and target (e.g. luxury vs. non-luxury tour operator).
This study investigated the suitability of different methodological approaches to automatic semantic tagging in the analysis of cultural traits as they emerge from subjective meaning reactions to given words (EMUs). Elicited data from British native speakers were collected and coded both manually and with an automatic tagging system (Wmatrix). The results of manual coding were then compared to the results offered by Wmatrix, at different levels and using a variety of methods. Furthermore, automatic tagging was applied to 10,000 sentences extracted from a general Web corpus and containing the node word, and the results of the Web corpus were compared to those of the elicited data. Though further investigation is needed, each of the experiments described provides interesting information for the definition of a method in the use of large corpora for the extraction of EMUs.