
Extract the first and last words from a string and remove them if they match a third word, using C#

Stack Overflow user
Asked 2018-03-16 20:44:33
Answers: 3 · Views: 921 · Followers: 0 · Votes: 0

My string looks like this:

string str = "Psppsp palm springs airport, 3400 e tahquitz canyon way, Palm springs, CA, US, 92262-6966 psppsp";

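I receive the string "psppsp" separately and need to compare it with the first and last words of str. If either the first or the last word matches, that word must be removed from str.

I would like to know the best and fastest way to do this.

3 Answers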

Stack Overflow user

Answered 2018-03-16 21:46:08

The fastest approach is O(n). Here is a code example; it can be improved.

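        string str = "Psppsp palm springs airport, 3400 e tahquitz canyon way, Palm springs, CA, US, 92262-6966 psppsp";
        string word = "psppsp";

        // If str is exactly the word (ignoring case), nothing remains
        if (string.Equals(str, word, StringComparison.OrdinalIgnoreCase))
        {
            str = "";
        }

        // Check first word in str (case-insensitive, because the sample starts with "Psppsp")
        if (str.Length > word.Length)
        {
            bool equal = true;
            for (int i = 0; i < word.Length; i++)
            {
                if (char.ToLowerInvariant(str[i]) != char.ToLowerInvariant(word[i]))
                {
                    equal = false;
                    break;
                }
            }
            // Require a space after the word so a longer first word is not truncated
            if (equal && str[word.Length] == ' ')
            {
                // Skip the word and the space that follows it
                str = str.Substring(word.Length + 1);
            }
        }

        // Check last word in str
        if (str.Length > word.Length)
        {
            // Index where the last word would start
            int offset = str.Length - word.Length;
            bool equal = true;
            for (int i = word.Length - 1; i >= 0; i--)
            {
                if (char.ToLowerInvariant(str[offset + i]) != char.ToLowerInvariant(word[i]))
                {
                    equal = false;
                    break;
                }
            }
            // Require a space before the word so a longer last word is not truncated
            if (equal && str[offset - 1] == ' ')
            {
                // Drop the word and the space that precedes it
                str = str.Substring(0, offset - 1);
            }
        }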
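
The same O(n) idea can be sketched more compactly by letting `string.Compare` do the character comparison at a given offset, and by tracking a start and end index so that only one `Substring` call is made. The class and method names (`EdgeWordTrimmer`, `TrimEdgeWord`) are hypothetical, and the sketch assumes words are separated by a single space, as in the sample string.

```csharp
using System;

public static class EdgeWordTrimmer
{
    // Hypothetical helper: removes `word` from the start and/or end of `str`
    // when it appears there as a whole, space-delimited word (case-insensitive).
    public static string TrimEdgeWord(string str, string word)
    {
        if (string.IsNullOrEmpty(str) || string.IsNullOrEmpty(word))
            return str;

        int start = 0;
        int end = str.Length; // exclusive

        // First word: str must begin with word, followed by a space.
        if (word.Length < str.Length
            && string.Compare(str, 0, word, 0, word.Length, StringComparison.OrdinalIgnoreCase) == 0
            && str[word.Length] == ' ')
        {
            start = word.Length + 1;
        }

        // Last word: what remains must end with word, preceded by a space
        // (or consist of nothing but word).
        int tail = end - word.Length;
        if (tail >= start
            && string.Compare(str, tail, word, 0, word.Length, StringComparison.OrdinalIgnoreCase) == 0
            && (tail == start || str[tail - 1] == ' '))
        {
            end = (tail == start) ? start : tail - 1;
        }

        return str.Substring(start, end - start);
    }
}
```

Called as `EdgeWordTrimmer.TrimEdgeWord(str, "psppsp")`, this leaves the middle of the string untouched and allocates only the final result.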
Votes: 0

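Stack Overflow user

Answered 2018-03-16 21:58:36

There are several ways to do this. Here is one that uses regular expressions. If you apply it to many strings, you can precompile the regexes, which speeds up execution: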

string str = "Psppsp palm springs airport, 3400 e tahquitz canyon way, Palm springs, CA, US, 92262-6966 psppsp";

string match = "psppsp";

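// Build 2 re-usable regexes. Regex.Escape keeps any special characters in match literal,
// and the (\s+|$) / (^|\s+) groups ensure that only a whole word is matched
string pattern1 = "^" + Regex.Escape(match) + "(\\s+|$)";
string pattern2 = "(^|\\s+)" + Regex.Escape(match) + "$";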
Regex rgx1 = new Regex(pattern1, RegexOptions.Compiled | RegexOptions.IgnoreCase);
Regex rgx2 = new Regex(pattern2, RegexOptions.Compiled | RegexOptions.IgnoreCase);

// Apply the 2 regexes
str = rgx1.Replace(rgx2.Replace(str, ""), "");
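
As a variation on the two-regex approach, both anchored cases can be folded into one alternation so that a single `Replace` call handles the start and the end of the string. This is only a sketch: the names `EdgeWordRegex` and `Build` are hypothetical, and it assumes the word is delimited by whitespace.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EdgeWordRegex
{
    // Hypothetical helper: builds one reusable regex that matches `word`
    // either as the first word (plus trailing whitespace) or as the last
    // word (plus leading whitespace).
    public static Regex Build(string word)
    {
        // Escape the word so characters such as '.' or '+' are taken literally.
        string w = Regex.Escape(word);
        string pattern = "^" + w + @"(\s+|$)|(^|\s+)" + w + "$";
        return new Regex(
            pattern,
            RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }
}
```

Usage would look like `string cleaned = EdgeWordRegex.Build("psppsp").Replace(str, "");`. Building the regex once and reusing it matters more than the choice between one pattern and two.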

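If the match cannot occur anywhere else in the string, you can split the string into words and filter them (the code below uses List<T>.RemoveAll; LINQ's Where would work as well). This involves converting the array returned by Split into a list: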

// Convert to list
var tempList = new List<string>(str.Split());
// Remove all occurrences of match
tempList.RemoveAll(x => String.Compare(x, match, StringComparison.OrdinalIgnoreCase) == 0);
// Convert list back to string
str = String.Join(" ", tempList.ToArray());
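
If the word may also appear in the middle of the string, a token-based variant can inspect only the first and last entries instead of calling `RemoveAll`. The sketch below uses hypothetical names (`EdgeWordTokens`, `RemoveEdgeTokens`) and assumes words are separated by spaces. Note that splitting and re-joining collapses runs of spaces into one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EdgeWordTokens
{
    // Hypothetical helper: drops the first and/or last space-separated token
    // when it equals `match` (case-insensitive); tokens in between are kept.
    public static string RemoveEdgeTokens(string str, string match)
    {
        List<string> words = str
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .ToList();

        if (words.Count > 0 && string.Equals(words[0], match, StringComparison.OrdinalIgnoreCase))
            words.RemoveAt(0);

        if (words.Count > 0 && string.Equals(words[words.Count - 1], match, StringComparison.OrdinalIgnoreCase))
            words.RemoveAt(words.Count - 1);

        return string.Join(" ", words);
    }
}
```

This trades some allocation (one string per token) for code that is easy to read and handles word boundaries without extra checks.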

Or, a simpler approach:

if (str.StartsWith(match, StringComparison.InvariantCultureIgnoreCase)) {
    str = str.Substring(match.Length);
}
if (str.EndsWith(match, StringComparison.InvariantCultureIgnoreCase)) {
    str = str.Substring(0, str.Length - match.Length);
}
str = str.Trim();

I'm not sure which of these, if any, is the "best". I like the last one.

Votes: 0

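Stack Overflow user

Answered 2018-03-16 22:07:07

You can use str.StartsWith(x), str.EndsWith(x), str.Contains(x), and str.IndexOf(x) to find and locate the search string, and str.Substring(start, len) to change the string. There are many ways to implement this kind of string manipulation, but you asked for...

The best and fastest: let's use some perfectly safe "unsafe" code so that we can use pointers (the project must be compiled with unsafe code enabled).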

    // note this is an extension method so you need to include it in a static class
    public unsafe static string RemoveCaseInsensitive(this string source, string remove)
    {
        // convert both strings to lower to enable case insensitive comparison
        // (comparing against the un-lowered remove string would never match "Psppsp")
        string sourceLower = source.ToLower();
        string removeLower = remove.ToLower();

        // define working pointers
        int srcPos = 0;
        int srcLen = source.Length;
        int dstPos = 0;
        int rmvPos = 0;
        int rmvLen = remove.Length;

        // create char arrays to work with in the 'unsafe' code
        char[] destChar = new char[srcLen];

        fixed (char* srcPtr = source, srcLwrPtr = sourceLower, rmvPtr = removeLower, dstPtr = destChar)
        {
            // loop through each char in the source array
            while (srcPos < srcLen)
            {
                // copy the char and move dest position on
                *(dstPtr + dstPos) = *(srcPtr + srcPos);
                dstPos++;

                // compare source char to remove char
                // note we're comparing against the sourceLower but copying from source so that
                // a case insensitive remove preserves the rest of the string's original case
                if (*(srcLwrPtr + srcPos) == *(rmvPtr + rmvPos))
                {
                    rmvPos++;
                    if (rmvPos == rmvLen)
                    {
                        // if the whole string has been matched
                        // reverse dest position back by length of remove string
                        dstPos -= rmvPos;
                        rmvPos = 0;
                    }
                }
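                else
                {
                    // mismatch: restart matching at the next source char
                    // (the current char is not re-tested against the first remove char,
                    // so an occurrence that begins inside a failed partial match is missed)
                    rmvPos = 0;
                }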

                // move to next char in source
                srcPos++;
            }
        }

        // return the string
        return new string(destChar, 0, dstPos);
    }

Usage:

// strings are immutable, so the result must be captured rather than discarded
string all = str.RemoveCaseInsensitive("Psppsp"); // this will remove all instances throughout the string
string first = str.RemoveCaseInsensitive("Psppsp "); // space included at the end so in your example will remove the first instance and trailing space.
string last = str.RemoveCaseInsensitive(" psppsp"); // space included at the start so in your example will remove the final instance and leading space.

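You might ask: why use unsafe code? When you work with arrays, a bounds check is performed every time you index an element, so str[1], str[2], str[3], and so on each carry overhead. When you run this kind of check over thousands of characters, it adds up. With unsafe code you can use pointers to access memory directly, with no bounds checks or anything else to slow the operation down. The difference in performance can be large.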
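
For comparison, here is a sketch of a fully safe alternative that avoids manual indexing altogether by letting `string.IndexOf` locate each occurrence and a `StringBuilder` collect the pieces in between. This is not the answer's method, and the names `SafeRemove` and `RemoveAllIgnoreCase` are hypothetical. No claim is made here about how its speed compares with the pointer version; that would have to be measured.

```csharp
using System;
using System.Text;

public static class SafeRemove
{
    // Hypothetical helper: removes every case-insensitive occurrence of
    // `remove` from `source`, preserving the original casing of the rest.
    public static string RemoveAllIgnoreCase(string source, string remove)
    {
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(remove))
            return source;

        var sb = new StringBuilder(source.Length);
        int pos = 0;

        while (true)
        {
            // Find the next occurrence at or after pos.
            int hit = source.IndexOf(remove, pos, StringComparison.OrdinalIgnoreCase);
            if (hit < 0)
                break;

            // Keep the text between the previous occurrence and this one.
            sb.Append(source, pos, hit - pos);
            pos = hit + remove.Length;
        }

        // Keep whatever follows the last occurrence.
        sb.Append(source, pos, source.Length - pos);
        return sb.ToString();
    }
}
```

Because `IndexOf` performs a full search from each position, this variant also finds an occurrence that starts in the middle of a failed partial match, for example "goodbye" inside "ggoodbye".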

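As an example of the performance difference, I created two versions: a safe one that uses the standard string indexer, and an unsafe one. I built a test string by repeatedly appending thousands of copies of a string to keep and a string to remove. The result is clear: the unsafe version finishes in roughly half the time of the safe version. Apart from safe versus unsafe access, the two methods are identical.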

public static class StringExtensions
{
    public unsafe static string RemoveUnsafe(this string source, string remove)
    {
        // convert both strings to lower to enable case insensitive comparison
        // (RemoveSafe lowers the remove string too; without it "goodBYE" never matches)
        string sourceLower = source.ToLower();
        string removeLower = remove.ToLower();

        // define working pointers
        int srcPos = 0;
        int srcLen = source.Length;
        int dstPos = 0;
        int rmvPos = 0;
        int rmvLen = remove.Length;

        // create char arrays to work with in the 'unsafe' code
        char[] destChar = new char[srcLen];

        fixed (char* srcPtr = source, srcLwrPtr = sourceLower, rmvPtr = removeLower, dstPtr = destChar)
        {
            // loop through each char in the source array
            while (srcPos < srcLen)
            {
                // copy the char and move dest position on
                *(dstPtr + dstPos) = *(srcPtr + srcPos);
                dstPos++;

                // compare source char to remove char
                // note we're comparing against the sourceLower but copying from source so that
                // a case insensitive remove preserves the rest of the string's original case
                if (*(srcLwrPtr + srcPos) == *(rmvPtr + rmvPos))
                {
                    rmvPos++;
                    if (rmvPos == rmvLen)
                    {
                        // if the whole string has been matched
                        // reverse dest position back by length of remove string
                        dstPos -= rmvPos;
                        rmvPos = 0;
                    }
                }
                else
                {
                    rmvPos = 0;
                }

                // move to next char in source
                srcPos++;
            }
        }

        // return the string
        return new string(destChar, 0, dstPos);
    }

    public static string RemoveSafe(this string source, string remove)
    {
        // convert to lower to enable case insensitive comparison
        string sourceLower = source.ToLower();
        string removeLower = remove.ToLower();

        // define working pointers
        int srcPos = 0;
        int srcLen = source.Length;
        int dstPos = 0;
        int rmvPos = 0;
        int rmvLen = remove.Length;

        // create the destination char array
        char[] destChar = new char[srcLen];

        // loop through each char in the source array
        while (srcPos < srcLen)
        {
            // copy the char and move dest position on
            destChar[dstPos] = source[srcPos];
            dstPos++;

            // compare source char to remove char
            // note we're comparing against the sourceLower but copying from source so that
            // a case insensitive remove preserves the rest of the string's original case
            if (sourceLower[srcPos] == removeLower[rmvPos])
            {
                rmvPos++;
                if (rmvPos == rmvLen)
                {
                    // if the whole string has been matched
                    // reverse dest position back by length of remove string
                    dstPos -= rmvPos;
                    rmvPos = 0;
                }
            }
            else
            {
                rmvPos = 0;
            }

            // move to next char in source
            srcPos++;
        }

        // return the string
        return new string(destChar, 0, dstPos);
    }
}

Here is the benchmark:

internal static class StringRemoveTests
{
    private static string CreateString()
    {
        string x = "xxxxxxxxxxxxxxxxxxxx";
        string y = "GoodBye";

        StringBuilder sb = new StringBuilder();

        for (int i = 0; i < 1000000; i++)
            sb.Append(i % 3 == 0 ? y : x);

        return sb.ToString();
    }

    private static int RunBenchMarkUnsafe()
    {
        string str = CreateString();
        DateTime start = DateTime.Now;
        string str2 = str.RemoveUnsafe("goodBYE");
        DateTime end = DateTime.Now;

        return (int)(end - start).TotalMilliseconds;
    }
    private static int RunBenchMarkSafe()
    {
        string str = CreateString();
        DateTime start = DateTime.Now;
        string str2 = str.RemoveSafe("goodBYE");
        DateTime end = DateTime.Now;

        return (int)(end - start).TotalMilliseconds;
    }

    public static void RunBenchmarks()
    {
        Console.WriteLine("Safe version: " + RunBenchMarkSafe());
        Console.WriteLine("Unsafe version: " + RunBenchMarkUnsafe());
    }
}

class Program
{
    static void Main(string[] args)
    {
        StringRemoveTests.RunBenchmarks();
        Console.ReadLine();
    }
}

Output (results in milliseconds):

// 1st run
Safe version: 569
Unsafe version: 260

// 2nd run
Safe version: 709
Unsafe version: 329

// 3rd run
Safe version: 486
Unsafe version: 279
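
The figures above come from single runs timed with `DateTime.Now`, which has coarse resolution and includes JIT compilation in the first call. A slightly more robust timing harness might use `Stopwatch`, a few warm-up calls, and the median of several runs. The sketch below is an assumption about how one could re-measure, not what produced the numbers above; `Bench` and `MedianMilliseconds` are hypothetical names.

```csharp
using System;
using System.Diagnostics;

public static class Bench
{
    // Hypothetical helper: runs `action` a few times to warm up the JIT,
    // then times `runs` executions and returns the median in milliseconds.
    public static double MedianMilliseconds(Action action, int warmup = 2, int runs = 5)
    {
        for (int i = 0; i < warmup; i++)
            action();

        var samples = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            samples[i] = sw.Elapsed.TotalMilliseconds;
        }

        // The median is less sensitive to a single slow run than the mean.
        Array.Sort(samples);
        return samples[runs / 2];
    }
}
```

With the extension methods above in scope, a call such as `Bench.MedianMilliseconds(() => str.RemoveSafe("goodBYE"))` would give one figure per method to compare.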
Votes: 0
The original page content is provided by Stack Overflow. Translation support is provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/49321105
