C++ modernization and boost removal (#182)
* Bump boost to 1.83

* Switch to building as C++20

Co-authored-by: Ryan Lucia <[email protected]>

* Fix many warnings

* Rip out boost::locale and just use ICU directly

* Fix

* Replace boost::filesystem with std::filesystem

* Fix

* Additional C++ modernization

* Use string_view all over the place and rip out more boost stuff

* Simply BOM handling in charset conversions

* libaegisub: fix dispatch types

* Fix

* Bump subprojects

* Reorganize tests build file

* Fix remaining compilation errors on Linux

* Rip out the last bits of boost::filesystem

* Remove remaining uses of boost string joins

* Fix some errors introduced with refactors

* Revert "Simply BOM handling in charset conversions"

This reverts commit 2e6b26d.
Taking out the BOM handling broke tests, so it'll probably break more
stuff.

* Bring back IconvWrapper::RequiredBufferSize

This partially reverts 62befa9.
This function is actually used in charset_conv_win.cpp

* Fix ifind after moving to ICU

The previous logic didn't check if the match was on parts of
decomposed characters, so it also failed the corresponding test.

* Remove incorrect karaoke_matcher test

This was clearly incorrect and probably just unfinished.

* Remove leftover boost::locale code

* Move iconv include to charset_conv.h

On newer mac sdks iconv_t is defined differently, so it's harder to
just have a typedef for it.

* Fix compilation on arm64 mac

wx uses a different string implementation here, and utf8_string()
doesn't exist there.

* Fix luajit dependency in luabins project

Since luajit always first tried using dependency(), further calls
of dependency() will also always return system luajit.
meson.override_dependency() won't work.
This makes luabins link system luajit where it's available while aegisub
itself uses the subproject's luajit, which causes all kinds of fun
issues and definitely didn't baffle me for hours...

The added solution for this is horribly ugly (and also has problems when
reconfiguring...) but it's the only one I found that works. Maybe it's
better to always require internal luajit after all, or make the user
choose with a meson option?

* Fix locale initialization

Previously this would fail on startup because the automation menu
uses boost::locale::comparator.
... Or maybe the locale init change should just be reverted entirely?
Or it should be something different? I don't really know.

* Revert "Fix luajit dependency in luabins project"

This reverts commit 340fb9c.

* Fix luajit dependency in luabins project, take 2

Thinking about it some more, just copying the detection logic is
probably the lesser evil here.

* Fix agi::split_iterator after refactor

is_end being removed caused it to not output an empty segment at the
end if the input ends with a delimiter, but existing usages relied
on it doing that.

* Fix style parsing after refactor

* Fix tons of implicit this captures

* Enable CI to test

* Update deprecated hunspell usage

* Fix tests compilation on mac

* Make sure wx subproject builds with c++14

* Fix compilation on Windows

* Revert "Bring back IconvWrapper::RequiredBufferSize"

This reverts commit 04f4b26.

* Pin libass wrap for now

Apparently dependency('iconv') breaks when iconv is overridden??

* Fix compilation with wx 3.0

* Fix startup crash on Windows

windows.h was defining the ERROR macro, which shadowed the
DialogueTokenType enum variant, which broke the lexer construction.

* Fix SplitText ICU logic

Include UBRK_WORD_IDEO and check the entire rules vec. This now matches
the logic of boost::locale.

* Add test for character_count with \N and friends

* Fix ass_dialogue parsing after refactor

* Revert "Pin libass wrap for now"

This reverts commit 3802bb7.

* Remove iconv's stdbool.h

This was breaking things (libass) and doesn't seem to be
needed any more.

* Revert changes to to_wx

These broke some things, in particular FromUTF8Unchecked seems to not
like empty strings. Probably safer to just revert.

* Fix kara replacer after refactor

* Fix karaoke timing mode after refactor

* Revert "Enable CI to test"

This reverts commit 256cbeb.

---------

Co-authored-by: Ryan Lucia <[email protected]>
Co-authored-by: Thomas Goyne <[email protected]>
3 people authored Dec 3, 2023
1 parent 9a2fdb9 commit 0b40de8
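
The "Fix agi::split_iterator after refactor" note in the commit message says callers rely on getting an empty final segment when the input ends with a delimiter. The sketch below illustrates that behaviour only. `split_keep_trailing` is a hypothetical free function written for this note, not the real `agi::split_iterator`.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper (not the real agi::split_iterator): splits `text` on
// `delim` and keeps empty segments, including the one after a trailing
// delimiter.
std::vector<std::string_view> split_keep_trailing(std::string_view text, char delim) {
    std::vector<std::string_view> out;
    std::size_t start = 0;
    while (true) {
        std::size_t end = text.find(delim, start);
        if (end == std::string_view::npos) {
            // Final segment; it is empty when the input ends with `delim`.
            out.push_back(text.substr(start));
            return out;
        }
        out.push_back(text.substr(start, end - start));
        start = end + 1;
    }
}
```

With this shape, `"a,b,"` produces three segments, the last of which is empty, and an empty input produces a single empty segment.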
Showing 314 changed files with 3,797 additions and 3,845 deletions.
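
The "Fix startup crash on Windows" note describes a macro named `ERROR` shadowing an enum variant. The sketch below shows how such a collision can change a value silently instead of failing to compile. The `token_type` namespace and its values are hypothetical, and the `#define` is only a stand-in for the definition that Windows headers are assumed to supply. It is not a copy of the fix in this commit.

```cpp
// Hypothetical enum, loosely modelled on a token-type namespace.
namespace token_type {
enum Type { TEXT = 1000, WORD, ERROR };  // declared before the macro exists
}

// Stand-in for the macro that Windows headers are assumed to define.
#define ERROR 0

using namespace token_type;

// The preprocessor rewrites the unqualified name, so this returns 0.
inline int with_macro() { return ERROR; }

#undef ERROR

// With the macro gone, the name refers to token_type::ERROR again.
inline int without_macro() { return ERROR; }
```

Removing the macro with `#undef` after including the offending header, or renaming the enumerator, are two common ways to avoid this kind of collision.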
6 changes: 3 additions & 3 deletions automation/tests/aegisub.cpp
@@ -20,8 +20,8 @@

#include <libaegisub/dispatch.h>
#include <libaegisub/log.h>
+#include <libaegisub/util.h>

-#include <boost/locale/generator.hpp>
#include <cstdio>
#include <cstdlib>

@@ -36,7 +36,7 @@ void check(lua_State *L, int status) {
}

int close_and_exit(lua_State *L) {
-int status = lua_tointeger(L, 1);
+int status = (int)lua_tointeger(L, 1);
lua_close(L);
exit(status);
}
@@ -48,7 +48,7 @@ int main(int argc, char **argv) {
return 1;
}

-std::locale::global(boost::locale::generator().generate(""));
+agi::util::InitLocale();
agi::dispatch::Init([](agi::dispatch::Thunk f) { });
agi::log::log = new agi::log::LogSink;

65 changes: 41 additions & 24 deletions libaegisub/ass/dialogue_parser.cpp
@@ -16,22 +16,20 @@

#include "libaegisub/ass/dialogue_parser.h"

+#include "libaegisub/exception.h"
#include "libaegisub/spellchecker.h"
-
-#include <boost/locale/boundary/index.hpp>
-#include <boost/locale/boundary/segment.hpp>
-#include <boost/locale/boundary/types.hpp>
+#include "libaegisub/unicode.h"

namespace {

-typedef std::vector<agi::ass::DialogueToken> TokenVec;
+using TokenVec = std::vector<agi::ass::DialogueToken>;
using namespace agi::ass;
namespace dt = DialogueTokenType;
namespace ss = SyntaxStyle;

class SyntaxHighlighter {
TokenVec ranges;
-std::string const& text;
+std::string_view text;
agi::SpellChecker *spellchecker;

void SetStyling(size_t len, int type) {
@@ -42,7 +40,7 @@ class SyntaxHighlighter {
}

public:
-SyntaxHighlighter(std::string const& text, agi::SpellChecker *spellchecker)
+SyntaxHighlighter(std::string_view text, agi::SpellChecker *spellchecker)
: text(text)
, spellchecker(spellchecker)
{ }
@@ -91,7 +89,7 @@ class SyntaxHighlighter {
};

class WordSplitter {
-std::string const& text;
+std::string_view text;
std::vector<DialogueToken> &tokens;
size_t pos = 0;

@@ -107,19 +105,39 @@ class WordSplitter {
}

void SplitText(size_t &i) {
-using namespace boost::locale::boundary;
-ssegment_index map(word, text.begin() + pos, text.begin() + pos + tokens[i].length);
-for (auto const& segment : map) {
-auto len = static_cast<size_t>(distance(begin(segment), end(segment)));
-if (segment.rule() & word_letters)
-SwitchTo(i, dt::WORD, len);
-else
-SwitchTo(i, dt::TEXT, len);
+UErrorCode err = U_ZERO_ERROR;
+thread_local std::unique_ptr<icu::BreakIterator> bi(icu::BreakIterator::createWordInstance(icu::Locale::getDefault(), err));
+agi::UTextPtr ut(utext_openUTF8(nullptr, text.data() + pos, tokens[i].length, &err));
+bi->setText(ut.get(), err);
+if (U_FAILURE(err)) throw agi::InternalError(u_errorName(err));
+size_t pos = 0;
+while (bi->next() != UBRK_DONE) {
+auto len = bi->current() - pos;
+
+std::vector<int32_t> rules(8);
+int n = bi->getRuleStatusVec(rules.data(), rules.size(), err);
+if (err == U_BUFFER_OVERFLOW_ERROR) {
+err = U_ZERO_ERROR;
+bi->getRuleStatusVec(rules.data(), rules.size(), err);
+}
+
+if (U_FAILURE(err)) throw agi::InternalError(u_errorName(err));
+
+auto token_type = dt::TEXT;
+
+for (size_t i = 0; i < n; i++) {
+if (rules[i] >= UBRK_WORD_LETTER && rules[i] < UBRK_WORD_IDEO_LIMIT) {
+token_type = dt::WORD;
+break;
+}
+}
+SwitchTo(i, token_type, len);
+pos = bi->current();
}
}

public:
-WordSplitter(std::string const& text, std::vector<DialogueToken> &tokens)
+WordSplitter(std::string_view text, std::vector<DialogueToken> &tokens)
: text(text)
, tokens(tokens)
{ }
@@ -137,14 +155,15 @@ class WordSplitter {
};
}

-namespace agi {
-namespace ass {
+namespace agi::ass {

-std::vector<DialogueToken> SyntaxHighlight(std::string const& text, std::vector<DialogueToken> const& tokens, SpellChecker *spellchecker) {
+std::vector<DialogueToken> SyntaxHighlight(std::string_view text,
+std::vector<DialogueToken> const& tokens,
+SpellChecker *spellchecker) {
return SyntaxHighlighter(text, spellchecker).Highlight(tokens);
}

-void MarkDrawings(std::string const& str, std::vector<DialogueToken> &tokens) {
+void MarkDrawings(std::string_view str, std::vector<DialogueToken> &tokens) {
if (tokens.empty()) return;

size_t last_ovr_end = 0;
@@ -209,10 +228,8 @@ void MarkDrawings(std::string const& str, std::vector<DialogueToken> &tokens) {
}
}

-void SplitWords(std::string const& str, std::vector<DialogueToken> &tokens) {
+void SplitWords(std::string_view str, std::vector<DialogueToken> &tokens) {
MarkDrawings(str, tokens);
WordSplitter(str, tokens).SplitWords();
}

}
}
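
The new `SplitText` treats a segment as a word when any of its rule-status values falls in the range from `UBRK_WORD_LETTER` up to, but not including, `UBRK_WORD_IDEO_LIMIT`. The sketch below isolates only that range check so it can be read and tested without ICU. The two constants are copied by hand from ICU's `ubrk.h` as an assumption of this note, and `is_word_segment` is a hypothetical name, not a function from the commit.

```cpp
#include <cstdint>
#include <vector>

// Assumed values mirrored from ICU's UWordBreak enum in ubrk.h, repeated
// here so the sketch builds without ICU.
constexpr std::int32_t kWordLetter = 200;     // UBRK_WORD_LETTER
constexpr std::int32_t kWordIdeoLimit = 500;  // UBRK_WORD_IDEO_LIMIT

// Hypothetical helper: true if any rule status marks the segment as
// letters, kana, or ideographs.
bool is_word_segment(std::vector<std::int32_t> const& rule_statuses) {
    for (std::int32_t status : rule_statuses) {
        if (status >= kWordLetter && status < kWordIdeoLimit)
            return true;
    }
    return false;
}
```

Under these assumed values, plain numbers (status 100 to 199) and whitespace or punctuation (status below 100) stay classified as text, which matches the "Include UBRK_WORD_IDEO and check the entire rules vec" description in the commit message.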
205 changes: 205 additions & 0 deletions libaegisub/ass/karaoke.cpp
@@ -0,0 +1,205 @@
// Copyright (c) 2022, Thomas Goyne <[email protected]>
//
// Permission to use, copy, modify, and distribute this software for any
// purpose with or without fee is hereby granted, provided that the above
// copyright notice and this permission notice appear in all copies.
//
// THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
// WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
// MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
// ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
// WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
// ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
// OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
//
// Aegisub Project http://www.aegisub.org/

#include <libaegisub/ass/karaoke.h>

#include <libaegisub/karaoke_matcher.h>
#include <libaegisub/format.h>

#include <boost/algorithm/string/predicate.hpp>
#include <boost/algorithm/string/trim.hpp>

namespace agi::ass {
std::string KaraokeSyllable::GetText(bool k_tag) const {
std::string ret;

if (k_tag)
ret = agi::format("{%s%d}", tag_type, ((duration + 5) / 10));

std::string_view sv = text;
size_t idx = 0;
for (auto const& ovr : ovr_tags) {
ret += sv.substr(idx, ovr.first - idx);
ret += ovr.second;
idx = ovr.first;
}
ret += sv.substr(idx);
return ret;
}

void Karaoke::SetLine(std::vector<KaraokeSyllable>&& syls, bool auto_split, std::optional<int> end_time) {
this->syls = std::move(syls);

if (end_time) {
Normalize(*end_time);
}

// Add karaoke splits at each space
if (auto_split && size() == 1) {
AutoSplit();
}

AnnounceSyllablesChanged();
}

void Karaoke::Normalize(int end_time) {
auto& last_syl = syls.back();
int last_end = last_syl.start_time + last_syl.duration;

// Total duration is shorter than the line length so just extend the last
// syllable; this has no effect on rendering but is easier to work with
if (last_end < end_time)
last_syl.duration += end_time - last_end;
else if (last_end > end_time) {
// Truncate any syllables that extend past the end of the line
for (auto& syl : syls) {
if (syl.start_time > end_time) {
syl.start_time = end_time;
syl.duration = 0;
}
else {
syl.duration = std::min(syl.duration, end_time - syl.start_time);
}
}
}
}

void Karaoke::AutoSplit() {
size_t pos;
while ((pos = syls.back().text.find(' ')) != std::string::npos)
DoAddSplit(syls.size() - 1, pos + 1);
}

std::string Karaoke::GetText() const {
std::string text;
text.reserve(size() * 10);

for (auto const& syl : syls)
text += syl.GetText(true);

return text;
}

std::string_view Karaoke::GetTagType() const {
return begin()->tag_type;
}

void Karaoke::SetTagType(std::string_view new_type) {
for (auto& syl : syls)
syl.tag_type = new_type;
}

void Karaoke::DoAddSplit(size_t syl_idx, size_t pos) {
syls.insert(syls.begin() + syl_idx + 1, KaraokeSyllable());
KaraokeSyllable &syl = syls[syl_idx];
KaraokeSyllable &new_syl = syls[syl_idx + 1];

// If the syl is empty or the user is adding a syllable past the last
// character then pos will be out of bounds. Doing this is a bit goofy,
// but it's sometimes required for complex karaoke scripts
if (pos < syl.text.size()) {
new_syl.text = syl.text.substr(pos);
syl.text = syl.text.substr(0, pos);
}

if (new_syl.text.empty())
new_syl.duration = 0;
else if (syl.text.empty()) {
new_syl.duration = syl.duration;
syl.duration = 0;
}
else {
new_syl.duration = (syl.duration * new_syl.text.size() / (syl.text.size() + new_syl.text.size()) + 5) / 10 * 10;
syl.duration -= new_syl.duration;
}

assert(syl.duration >= 0);

new_syl.start_time = syl.start_time + syl.duration;
new_syl.tag_type = syl.tag_type;

// Move all override tags after the split to the new syllable and fix the indices
size_t text_len = syl.text.size();
for (auto it = syl.ovr_tags.begin(); it != syl.ovr_tags.end(); ) {
if (it->first < text_len)
++it;
else {
new_syl.ovr_tags[it->first - text_len] = it->second;
syl.ovr_tags.erase(it++);
}
}
}

void Karaoke::AddSplit(size_t syl_idx, size_t pos) {
DoAddSplit(syl_idx, pos);
AnnounceSyllablesChanged();
}

void Karaoke::RemoveSplit(size_t syl_idx) {
// Don't allow removing the first syllable
if (syl_idx == 0) return;

KaraokeSyllable &syl = syls[syl_idx];
KaraokeSyllable &prev = syls[syl_idx - 1];

prev.duration += syl.duration;
for (auto const& tag : syl.ovr_tags)
prev.ovr_tags[tag.first + prev.text.size()] = tag.second;
prev.text += syl.text;

syls.erase(syls.begin() + syl_idx);

AnnounceSyllablesChanged();
}

void Karaoke::SetStartTime(size_t syl_idx, int time) {
// Don't allow moving the first syllable
if (syl_idx == 0) return;

KaraokeSyllable &syl = syls[syl_idx];
KaraokeSyllable &prev = syls[syl_idx - 1];

assert(time >= prev.start_time);
assert(time <= syl.start_time + syl.duration);

int delta = time - syl.start_time;
syl.start_time = time;
syl.duration -= delta;
prev.duration += delta;
}

void Karaoke::SetLineTimes(int start_time, int end_time) {
assert(end_time >= start_time);

size_t idx = 0;
// Chop off any portion of syllables starting before the new start_time
do {
int delta = start_time - syls[idx].start_time;
syls[idx].start_time = start_time;
syls[idx].duration = std::max(0, syls[idx].duration - delta);
} while (++idx < syls.size() && syls[idx].start_time < start_time);

// And truncate any syllables ending after the new end_time
idx = syls.size() - 1;
while (syls[idx].start_time > end_time) {
syls[idx].start_time = end_time;
syls[idx].duration = 0;
--idx;
}
syls[idx].duration = end_time - syls[idx].start_time;
}

} // namespace agi::ass
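
`Karaoke::DoAddSplit` above divides a syllable's duration in proportion to the text lengths on each side of the split and rounds the new syllable to the nearest 10 ms. The sketch below works through that arithmetic alone. `split_duration` is a hypothetical helper that uses plain `int` throughout and assumes both lengths are positive, since the original handles the empty cases separately.

```cpp
#include <utility>

// Hypothetical helper mirroring the proportional split in DoAddSplit.
// Returns {first_duration, second_duration} in milliseconds.
// Assumes first_len > 0 and second_len > 0.
std::pair<int, int> split_duration(int duration, int first_len, int second_len) {
    // Share of the duration for the second part, rounded to the nearest 10 ms.
    int second = (duration * second_len / (first_len + second_len) + 5) / 10 * 10;
    return {duration - second, second};
}
```

For a 500 ms syllable split into 3 and 4 characters, the second part gets 500 * 4 / 7 = 285 ms, which rounds to 290 ms, leaving 210 ms for the first part. The two parts always sum to the original duration because the first is computed by subtraction.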
2 changes: 1 addition & 1 deletion libaegisub/ass/time.cpp
@@ -26,7 +26,7 @@
namespace agi {
Time::Time(int time) : time(util::mid(0, time, 10 * 60 * 60 * 1000 - 6)) { }

-Time::Time(std::string const& text) {
+Time::Time(std::string_view text) {
int after_decimal = -1;
int current = 0;
for (char c : text) {
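
The `Time(int)` constructor in this hunk limits its argument with `util::mid(0, time, 10 * 60 * 60 * 1000 - 6)`. Assuming `util::mid(a, b, c)` clamps `b` to the range `[a, c]` (its definition is not part of this diff), the upper bound works out to 35,999,994 ms, or 9:59:59.994. The sketch below restates that with `std::clamp`. `clamp_time_ms` and `kMaxTimeMs` are hypothetical names.

```cpp
#include <algorithm>

// Upper bound taken from the constructor above: 10 hours minus 6 ms.
constexpr int kMaxTimeMs = 10 * 60 * 60 * 1000 - 6;

// Hypothetical equivalent of the clamp, assuming util::mid(lo, x, hi)
// behaves like std::clamp(x, lo, hi).
int clamp_time_ms(int ms) {
    return std::clamp(ms, 0, kMaxTimeMs);
}
```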
4 changes: 2 additions & 2 deletions libaegisub/ass/uuencode.cpp
@@ -24,7 +24,7 @@
// characters, and files with non-multiple-of-three lengths are padded with
// zero.

-namespace agi { namespace ass {
+namespace agi::ass {

std::string UUEncode(const char *begin, const char *end, bool insert_linebreaks) {
size_t size = std::distance(begin, end);
@@ -82,4 +82,4 @@ std::vector<char> UUDecode(const char *begin, const char *end) {

return ret;
}
-} }
+}
2 changes: 1 addition & 1 deletion libaegisub/audio/provider.cpp
@@ -81,7 +81,7 @@ class writer {
std::ostream& out;

public:
-writer(agi::fs::path const& filename) : outfile(filename, true), out(outfile.Get()) { }
+writer(std::filesystem::path const& filename) : outfile(filename, true), out(outfile.Get()) { }

template<int N>
void write(const char(&str)[N]) {
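
The `writer` constructor now takes a `std::filesystem::path` directly instead of `agi::fs::path`. As a small illustration of what that parameter type accepts, the sketch below passes a string literal and lets the standard path type handle the extension and filename. `swap_extension` is a hypothetical helper written for this note and is not part of libaegisub.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper: replaces the extension of `file` and returns only
// the resulting file name. Callers may pass a std::string or a literal,
// because std::filesystem::path converts from both implicitly.
std::string swap_extension(std::filesystem::path const& file, std::string const& ext) {
    std::filesystem::path result = file;
    result.replace_extension(ext);
    return result.filename().string();
}
```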